diff -Nuar linux-2.4.17.SuSE/Documentation/Configure.help linux-2.4.17.SuSE.imon/Documentation/Configure.help --- linux-2.4.17.SuSE/Documentation/Configure.help Wed Jan 23 15:19:26 2002 +++ linux-2.4.17.SuSE.imon/Documentation/Configure.help Sun Feb 3 20:19:17 2002 @@ -15963,6 +15963,22 @@ input/output character sets. Say Y here for the UTF-8 encoding of the Unicode/ISO9646 universal character set. +Inode Monitor (imon) support (EXPERIMENTAL) +CONFIG_IMON + This enables support for imon, the inode monitor. Through imon + and fam, the File Alteration Monitor, programs can express interest + in individual files and directories, and the kernel will notify those + programs when the files change. This enables desktops, mail readers, + administration tools, etc. to respond to changes in the system + immediately. If you don't enable imon, such programs will have to + poll files (by checking them every few seconds) to determine whether + they've changed. + + See http://oss.sgi.com/projects/fam/ for more information on imon. + + If you don't know whether you want imon, it doesn't do any harm to + say N here. (If you want imon, say M, not Y.) + Virtual terminal CONFIG_VT If you say Y here, you will get support for terminal devices with diff -Nuar linux-2.4.17.SuSE/Documentation/imon.txt linux-2.4.17.SuSE.imon/Documentation/imon.txt --- linux-2.4.17.SuSE/Documentation/imon.txt Thu Jan 1 01:00:00 1970 +++ linux-2.4.17.SuSE.imon/Documentation/imon.txt Sun Feb 3 20:19:17 2002 @@ -0,0 +1,216 @@ +These notes on imon consist of the following sections: + + 0. What's in this patch + 1. How imon works + 2. How imon should work + 3. The relationship between fam & imon + 4. Open/unresolved imon issues + 5. imon design flames + +Sections 1, 2, and 4 are probably the most interesting. + +Send feedback to fam@oss.sgi.com or rusty@sgi.com. + +imon was originally built by Wiltse Carpenter and/or Bruce Karsh on IRIX. It +was ported to Linux by Roger Chickering. 
A few additional bits, such as +portions of these notes, may have been done by Rusty Ballinger, but good luck +getting that guy to admit to anything. + + + +0. What's in this patch + +This patch is for 2.4.0-test9 kernels. It is EXPERIMENTAL, ONLY SLIGHTLY TESTED, +and portions of it should be rewritten; see section 2, "How imon should work." + +Modified files: + +Documentation/Configure.help added documentation for CONFIG_IMON +fs/Config.in added "source fs/imon/Config.in" +fs/Makefile added chunk for building imon +fs/attr.c added IMON_EVENT in notify_change, which should + be removed +fs/exec.c added CONFIG_EXECOUNT & IMON_EVENT in + do_execve; I'm not sure whether this can be + removed +fs/filesystems.c added init_imon(), which is only used if imon + is compiled in rather than being a module? + Can this be removed? imon should probably + always be a module. +fs/namei.c added IMON_EVENT in various functions, all of + which should be removed +fs/read_write.c added IMON_EVENT in sys_write and sys_writev, + which should be removed +fs/super.c (no change - should have added unmount + notification?) +include/linux/fs.h added unsigned int i_execount to inode struct +include/linux/sched.h added struct dentry *script to task_struct +kernel/exit.c added CONFIG_EXECOUNT stuff in do_exit +kernel/fork.c added CONFIG_EXECOUNT stuff in do_fork +kernel/ksyms.c added global imon symbols, which should be + removed as soon as possible + +New files: + +Documentation/imon.txt +fs/imon/Config.in +fs/imon/Makefile +fs/imon/imon.c +fs/imon/imon_static.c +include/linux/imon.h + + + +1. How imon works + +When a process opens /dev/imon and uses the IMONIOC_EXPRESS ioctl to express +interest in directories & files, imon sticks those interests in a hash table. +Various file operations conclude with a call to IMON_EVENT() or IMON_BROADCAST, +which are macros which call imon_event() or imon_broadcast() if the global +flag imon_enabled is set. 
+ +Inside of imon_event, if the dev/inode on which the event occurred is in the +hash table of expressed interests, the event is put in the event queue, which +the client process can read() from /dev/imon. When imon has no clients, the +imon_enabled flag is cleared, and file operations do not involve imon. (In a +kernel configured without imon support, the IMON_EVENT and IMON_BROADCAST +macros are no-ops.) + +The imon changes also include support for tracking how many instances of an +executable are currently running. This is done by adding to the inode struct a +count of the number of processes executing it. Roger's notes: + + When exec is called, the inode corresponding to the file being exec'd + has its i_execount field incremented, while the i_execount for the + previous inode being executed is decremented. The inode is kept track + of in the per-task (process) field called script (which is actually a + struct dentry which contains the inode of interest). + + The reason the field is called script is because it can be used to + keep track of exec'd shell scripts. The exec code in the kernel + actually sets up the process as if the shell were run instead of the + shell script, and the script/i_execount logic is used to remember the + fact that the shell script was run so that can be represented to the + user. I originally tried to implement IMON_EXEC/IMON_EXIT by looking + at memory regions mapped with execute permission, but this did not + work for shell scripts. + + Bookkeeping for the i_execount field is done in three places: + (1) In exec, as described above + (2) In fork, i_execount must be incremented, because the child process + is still executing the same inode + (3) In exit, i_execount is decremented. + + exec and exit notify imon when the flag gains or loses a value of 0. 
+ + The i_execount field of struct inode and the script field of struct + task_struct are not #ifdef CONFIG'd out because I didn't want to + introduce problems with different .o files having different notions of + the sizes of these very important and fundamental data structures. + + + +2. How imon should work + +When imon begins monitoring an inode, it should replace that inode's table of +file & inode operations with its own tables of functions that call the original +functions and then post the appropriate event on the queue. This way, the only +file operations intercepted by imon are on files that imon is monitoring. + +As it is now, as long as imon is monitoring at least one file, operations on +*any* file cause imon to check its hash table to determine whether the +operation was on a monitored file. This is not so good both because of the +unnecessary work being performed, and because of the IMON_EVENT() macro calls +which must be scattered throughout the filesystem code. (If this is changed, +then hopefully there won't be any imon code anywhere but the imon module.) + +It was done this way for a couple of reasons. A big one is that Linux has a +lot of "weird" filesystems which might perform their own alteration of the +inode operations, and there wasn't a good way to keep them & imon from stomping +each other. + +It would be nice if someone who's familiar with the filesystem code would spend +an evening or two and fix it. (I think it would take me longer than that, and +the quality of the results would be questionable.) + + + +3. The relationship between fam & imon + +fam, the File Alteration Monitor, is a user-level daemon; it's the way client +applications talk to imon. All client requests go to fam; fam determines which +requests to pass on to imon, reads events from /dev/imon, and passes them on to +the appropriate clients. 
On top of the local inode monitoring performed by +imon, fam provides the following services: + + - monitoring remote files + - serving remote requests for local files + - handling interest relocation when filesystems are unmounted + - monitoring deleted files + - monitoring local files when the kernel isn't configured for imon support + +When fam is running on a system which doesn't have kernel support for imon, it +polls files. This means that an application which uses the fam API will work +even on systems which do not have imon support enabled in the kernel. (The +difference is that there will be more latency between file operations & event +notification. Also, EXEC and EXIT events will not be delivered, but most +applications don't require this.) + + + +4. Open/unresolved imon issues + +All of these issues are trivia compared to the way imon hooks into the +filesystem. (See above.) + +- If the IMON_ATTR event type is not needed, it should be removed from imon.h. + It may be needed for famming individual files (or getting notification about + a fammed directory getting deleted). + +- Unmount notification has not been tested, and therefore probably does not + work. XXX YOU NEED TO VERIFY whether a kdev_t can be treated as a dev_t. + +- [This may no longer be true; it was happening on 2.2.13.] On my system, + every second I get a bunch of seemingly-bad events on device 0 with the i_sb + or i_sb->s_type == NULL. (Is this /proc? Should those events be + discarded?) According to kdev_t.h, device 0 is "no device," so perhaps I + should discard them without checking the hash table. + +- I think the reorganize-collision has a logic error which may affect + performance. Here are my notes, which may no longer be correct: + + 0 1 2 3 + Start + Insert 3a 3a + Insert 3b 3b 3a* + Insert 2a 2a 3b* 3a* + Insert 3c 3c 2a* 3b* 3a* + Delete 3b 3c 2a* 3a* + Delete 3a 2a* 3c + At this point, you have 2a incorrectly marked as a collision. What is the + harm in this? 
It could result in slightly slower probes in some cases? Is + that all? + + 0 1 2 3 + Start + Insert 3a 3a + Insert 3b 3b 3a* + Insert 1a 1a 3b 3a* + At this point, if you delete 3a, I think you will traverse more of the hash + table than you need to, trying to reorganize elements which are already + correctly hashed. How bad is this? + + + +5. imon design flames + +Sorry this section isn't as interesting as its title implies. I was going to +add some of the exchange here about whether imon should be an exclusive driver +(as it is), or whether it should allow any number of clients to open /dev/imon +at once. I did actually have a version which worked that way (it was pretty +neat), but when I told other people about it, they presented some compelling +arguments about why it was better to have it implemented the way it was. I may +add that argument here, as it will save people some time in the future if they +want to repeat it. + + diff -Nuar linux-2.4.17.SuSE/fs/Config.in linux-2.4.17.SuSE.imon/fs/Config.in --- linux-2.4.17.SuSE/fs/Config.in Wed Jan 23 15:19:25 2002 +++ linux-2.4.17.SuSE.imon/fs/Config.in Sun Feb 3 20:19:17 2002 @@ -169,4 +169,5 @@ source fs/partitions/Config.in endmenu source fs/nls/Config.in +source fs/imon/Config.in endmenu diff -Nuar linux-2.4.17.SuSE/fs/Makefile linux-2.4.17.SuSE.imon/fs/Makefile --- linux-2.4.17.SuSE/fs/Makefile Wed Jan 23 15:19:25 2002 +++ linux-2.4.17.SuSE.imon/fs/Makefile Sun Feb 3 20:23:27 2002 @@ -22,6 +22,7 @@ obj-y += noquot.o endif + subdir-$(CONFIG_PROC_FS) += proc subdir-y += partitions @@ -67,6 +68,7 @@ subdir-$(CONFIG_REISERFS_FS) += reiserfs subdir-$(CONFIG_DEVPTS_FS) += devpts subdir-$(CONFIG_SUN_OPENPROMFS) += openpromfs +subdir.$(CONFIG_IMON) += imon subdir-$(CONFIG_JFS_FS) += jfs @@ -75,11 +77,15 @@ obj-$(CONFIG_BINFMT_COFF) += binfmt_coff.o obj-$(CONFIG_BINFMT_EM86) += binfmt_em86.o obj-$(CONFIG_BINFMT_MISC) += binfmt_misc.o +obj-$(CONFIG_IMON) += imon/imon.o # binfmt_script is always there obj-y += 
binfmt_script.o obj-$(CONFIG_BINFMT_ELF) += binfmt_elf.o + +# so is imon +obj-y += imon/imon_static.o # persistent filesystems obj-y += $(join $(subdir-y),$(subdir-y:%=/%.o)) diff -Nuar linux-2.4.17.SuSE/fs/attr.c linux-2.4.17.SuSE.imon/fs/attr.c --- linux-2.4.17.SuSE/fs/attr.c Thu Oct 11 18:43:30 2001 +++ linux-2.4.17.SuSE.imon/fs/attr.c Sun Feb 3 20:19:17 2002 @@ -9,6 +9,7 @@ #include #include #include +#include /* this can go away when imon is done */ #include #include #include @@ -146,5 +147,6 @@ if (dn_mask) inode_dir_notify(dentry->d_parent->d_inode, dn_mask); } + IMON_EVENT_NOERR(error, inode, IMON_CONTENT); return error; } diff -Nuar linux-2.4.17.SuSE/fs/exec.c linux-2.4.17.SuSE.imon/fs/exec.c --- linux-2.4.17.SuSE/fs/exec.c Fri Dec 21 18:41:55 2001 +++ linux-2.4.17.SuSE.imon/fs/exec.c Sun Feb 3 20:19:17 2002 @@ -37,6 +37,7 @@ #include #define __NO_VERSION__ #include +#include /* this might be able to go away when imon is done */ #include #include @@ -859,6 +860,9 @@ struct file *file; int retval; int i; +#ifdef CONFIG_EXECOUNT + struct dentry * script; +#endif file = open_exec(filename); @@ -886,6 +890,9 @@ return bprm.envc; } +#ifdef CONFIG_EXECOUNT + script = dget(file->f_dentry); +#endif retval = prepare_binprm(&bprm); if (retval < 0) goto out; @@ -904,11 +911,28 @@ goto out; retval = search_binary_handler(&bprm,regs); - if (retval >= 0) + if (retval >= 0) { /* execve success */ +#ifdef CONFIG_EXECOUNT + if (current->script) { + if (--(current->script->d_inode->i_execount) == 0) { + IMON_EVENT(current->script->d_inode, IMON_EXIT); } + dput(current->script); + } + current->script = script; + if (script->d_inode->i_execount++ == 0) { + IMON_EVENT(script->d_inode, IMON_EXEC); + } +#endif return retval; + } out: + +#ifdef CONFIG_EXECOUNT + dput(script); +#endif + /* Something went wrong, return the inode and free the argument pages*/ allow_write_access(bprm.file); if (bprm.file) diff -Nuar linux-2.4.17.SuSE/fs/imon/Config.in 
linux-2.4.17.SuSE.imon/fs/imon/Config.in --- linux-2.4.17.SuSE/fs/imon/Config.in Thu Jan 1 01:00:00 1970 +++ linux-2.4.17.SuSE.imon/fs/imon/Config.in Sun Feb 3 20:19:17 2002 @@ -0,0 +1,10 @@ +# +# imon configuration +# + +comment 'Inode monitor support (experimental)' +tristate 'Inode monitor support (experimental)' CONFIG_IMON + +if [ "$CONFIG_IMON" != "n" ]; then + define_bool CONFIG_EXECOUNT y +fi diff -Nuar linux-2.4.17.SuSE/fs/imon/Makefile linux-2.4.17.SuSE.imon/fs/imon/Makefile --- linux-2.4.17.SuSE/fs/imon/Makefile Thu Jan 1 01:00:00 1970 +++ linux-2.4.17.SuSE.imon/fs/imon/Makefile Sun Feb 3 20:19:17 2002 @@ -0,0 +1,16 @@ +# +# Makefile for the Linux imon routines. +# +# Note! Dependencies are done automagically by 'make dep', which also +# removes any old dependencies. DON'T put your own dependencies here +# unless it's something special (not a .c file). +# +# Note 2! The CFLAGS definitions are now in the main makefile. + +O_TARGET := imon.o + +obj-y := imon.o imon_static.o +obj-m := $(O_TARGET) + +include $(TOPDIR)/Rules.make + diff -Nuar linux-2.4.17.SuSE/fs/imon/README linux-2.4.17.SuSE.imon/fs/imon/README --- linux-2.4.17.SuSE/fs/imon/README Thu Jan 1 01:00:00 1970 +++ linux-2.4.17.SuSE.imon/fs/imon/README Sun Feb 3 20:19:17 2002 @@ -0,0 +1,6 @@ + + + See Documentation/imon.txt for notes on imon, + what it's for, and how it's broken. + + diff -Nuar linux-2.4.17.SuSE/fs/imon/imon.c linux-2.4.17.SuSE.imon/fs/imon/imon.c --- linux-2.4.17.SuSE/fs/imon/imon.c Thu Jan 1 01:00:00 1970 +++ linux-2.4.17.SuSE.imon/fs/imon/imon.c Sun Feb 3 20:19:17 2002 @@ -0,0 +1,1320 @@ +/* + imon - inode monitor pseudo device + Copyright (C) 1999 Silicon Graphics, Inc. All Rights Reserved. + + This program is free software; you can redistribute it and/or modify it + under the terms of version 2 of the GNU General Public License as + published by the Free Software Foundation. 
+ + This program is distributed in the hope that it would be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. Further, any + license provided herein, whether implied or otherwise, is limited to + this program in accordance with the express provisions of the GNU + General Public License. Patent licenses, if any, provided herein do not + apply to combinations of this program with other product or programs, or + any other product whatsoever. This program is distributed without any + warranty that the program is delivered free of the rightful claim of any + third person by way of infringement or the like. See the GNU General + Public License for more details. + + You should have received a copy of the GNU General Public License along + with this program; if not, write the Free Software Foundation, Inc., 59 + Temple Place - Suite 330, Boston MA 02111-1307, USA. +*/ + +#include +#include +#include +#include +#include +#include +#include +#include +#include /* only needed if DEBUG_INTO_PROC is set below */ +#include +/*#include */ +#include +#include + +/* If this is set, imon will never receive fs events. */ +#define DISABLE_EVENTS 0 + +/* If this is set, debugging ioctls will be serviced. */ +#define ALLOW_DEBUG_IOCTLS 0 + +/* If this is set, debugging files in /proc will be available. */ +#define DEBUG_INTO_PROC 1 + +/* If this is defined, metering information will be gathered. There's no + * point in turning this on if you don't also turn on DEBUG_INTO_PROC. 
*/ +#define METER_ON 1 + +#if METER_ON +/* Metering statistics */ +typedef struct meter { + u_long im_lookups; /* hash table lookups */ + u_long im_probes; /* linear probes in hash table */ + u_long im_hits; /* lookups which found key */ + u_long im_misses; /* keys not found */ + u_long im_adds; /* entries added */ + u_long im_grows; /* table expansions */ + u_long im_shrinks; /* table compressions */ + u_long im_deletes; /* entries deleted */ + u_long im_dprobes; /* entries hashed while deleting */ + u_long im_dmoves; /* entries moved while deleting */ +} meter_t; +#define METER(expr) expr +#else +#define METER(expr) +#endif /* METER_ON */ + +#ifndef ASSERT +#define ASSERT(expression) { \ + if (!(expression)) { \ + (void)panic( \ + "assertion \"%s\" failed: file \"%s\", line %d\n", \ + #expression, \ + __FILE__, __LINE__); \ + } \ +} +#endif + +/* queue functions */ +static int dequeue(int, qelem_t *); +static void enqueue(dev_t dev, ino_t ino, intmask_t); +static int initqueue(void); +static void freequeue(void); +static int imonqtest(void); + +/* queue structure is a circular array with pointers to head and tail */ +static struct { + qelem_t *q_head; /* next qe to send to user */ + qelem_t *q_tail; /* next avail qe to add an event */ + qelem_t *q_base; /* first element in qe array */ + qelem_t *q_last; /* last element in qe array */ + int q_thresh; /* quick wakeup threshold */ + struct timer_list q_timer; /* timeout id */ + int q_count; /* number of events in q */ + char q_tpending; /* timeout pending */ + char q_over; /* overflow flag */ + char q_wanted; /* user is sleeping */ +} ino_ev_q; + + +/* + * Interest hash table + */ +typedef struct inthash { + u_int ih_shift; /* log2(hash table size) */ + u_int ih_numfree; /* count of free entries */ + qelem_t *ih_base; /* dynamically allocated hash table */ + qelem_t *ih_limit; /* end of hash table */ +#if METER_ON + meter_t ih_meters; +#endif +} inthash_t; + +/* hash functions */ +static void clearhash(void); 
+static void hashremove(dev_t, ino_t, intmask_t); +static int hashinsert(dev_t, ino_t, intmask_t); +static intmask_t probehash(dev_t, ino_t); +static inthash_t imon_htable; + +static void __imon_event(struct inode *inode, int event); +static void __imon_broadcast(dev_t dev, int event); + +/* lock guarding event queue */ +static spinlock_t qlock; + +/* mutex guarding interest hash table */ +static DECLARE_MUTEX(hashsema); + +/* guards against simultaneous open/close */ +static DECLARE_MUTEX(opensema); + +/* for notification of events */ +static wait_queue_head_t imon_wait; + +/* major number of imon device */ +static int imon_major; +MODULE_PARM(imon_major, "i"); + +/* module name */ +static char *imon_name = IMON_NAME; +MODULE_PARM(imon_name, "s"); + +/* file operations for imon device */ +static ssize_t imon_read(struct file *file, char *buf, + size_t length, loff_t *offset); +static unsigned int imon_poll(struct file *file, + struct poll_table_struct *table); +static int imon_ioctl(struct inode *inode, struct file *file, + unsigned int cmd, unsigned long arg); +static int imon_open(struct inode *inode, struct file *file); +static int imon_release(struct inode *inode, struct file *file); + +static struct file_operations imon_fops = { + read: imon_read, + poll: imon_poll, + ioctl: imon_ioctl, + open: imon_open, + release: imon_release, +}; + +#if DEBUG_INTO_PROC +static int imon_hash_proc_info(char *, char **, off_t, int); +static int imon_meter_proc_info(char *, char **, off_t, int); +#endif /* DEBUG_INTO_PROC */ + +/* + * init_imon + * + * Description: + * Initialize qlock, get a major number for the imon device, & register + * debugging files in /proc. + * + * Returns: + * 0 if successful, -errno if error. 
+ */ +int __init init_imon(void) +{ + int result; + + printk(KERN_INFO "%s (inode monitor), $Revision: $\n", imon_name); + init_waitqueue_head(&imon_wait); + spin_lock_init(&qlock); + + result = register_chrdev(imon_major, imon_name, &imon_fops); + if (result < 0) { + printk(KERN_WARNING "%s: can't get major %d\n", imon_name, + imon_major); + return result; + } + if (imon_major == 0) { + imon_major = result; + } +#if DEBUG_INTO_PROC + create_proc_info_entry("imon-hash", 0, proc_root_fs, + imon_hash_proc_info); + create_proc_info_entry("imon-meter", 0, proc_root_fs, + imon_meter_proc_info); +#endif + return 0; +} + + +#ifdef MODULE + +/* + * init_module + * + * Description: + * Called when imon module is loaded. Get a major number + * for the imon device. + * + * Returns: + * 0 if successful, -errno if error. + */ +/* int init_module(void) +{ + return init_imon(); +} +*/ +/* + * cleanup_module + * + * Description: + * Called when imon module is unloaded. Release major number. + */ +void cleanup_module(void) +{ +#if DEBUG_INTO_PROC + remove_proc_entry("imon-hash", proc_root_fs); + remove_proc_entry("imon-meter", proc_root_fs); +#endif + unregister_chrdev(imon_major, imon_name); +} + +#endif /* MODULE */ + + +/* + * imon_ioctl + * + * Description: + * imon's ioctl method lets the caller express or revoke interest + * in a file, and check for whether events are pending. + * + * Parameters: + * inode imon device inode. + * file imon device file. + * cmd command (IMONIOC_QTEST, IMONIOC_EXPRESS, or IMONIOC_REVOKE) + * args ioctl arguments. + * + * Returns: + * 0 if successful, -errno if error. + */ +/* ARGSUSED */ +static int +imon_ioctl(struct inode *inode, struct file *file, + unsigned int cmd, unsigned long args) +{ + interest_t in; + int error = 0; + revoke_t revoke; + struct nameidata nd; + struct inode *ip; + + + switch (cmd) { + + /* + * Express interest in a file and optionally return that + * file's stat structure. 
Multiple expressions have their + * interests or'd together. + */ + case IMONIOC_EXPRESS: + /* get and check user request */ + if (copy_from_user(&in, (void*)args, sizeof(interest_t)) > 0) { + return -EFAULT; + } + if (!in.in_what || (in.in_what & ~IMON_USERMASK)) { + return -EINVAL; + } + error = user_path_walk(in.in_fname, &nd); + if (error) { + return error; + } + ip = nd.dentry->d_inode; + if (ip == NULL) { + path_release(&nd); + return -ENOENT; + } + + /* if user wants a stat too, get it now. */ + if (in.in_sb) { + famstat_t famstat; + famstat.st_dev = ip->i_dev; + famstat.st_ino = ip->i_ino; + if (copy_to_user(in.in_sb, &famstat, + sizeof(famstat)) > 0) { + path_release(&nd); + return -EFAULT; + } + } + + /* insert interest in inode/dev pair in hash table */ + error = hashinsert(ip->i_dev, ip->i_ino, in.in_what); + if (!error) { + /* Check if it is currently running. */ + if (ip->i_execount) { + enqueue(ip->i_dev, ip->i_ino, IMON_EXEC); + } + } + + /* done with inode for now */ + path_release(&nd); + return error; + + /* + * Revoke interest in a dev/inode pair. Only the interests + * specified by the in_what field are revoked. When the last + * interest in a dev/inode pair is revoked it is removed from + * the hash table. + */ + case IMONIOC_REVOKE: + if (copy_from_user(&revoke, (void*)args, sizeof revoke) > 0) { + return -EFAULT; + } + if (!revoke.rv_what || (revoke.rv_what & ~IMON_USERMASK)) + return -EINVAL; + hashremove(revoke.rv_dev, revoke.rv_ino, revoke.rv_what); + break; + /* + * Qtest returns non-zero when there are events available + * in the queue. + */ + case IMONIOC_QTEST: + return imonqtest(); + break; + + +#if ALLOW_DEBUG_IOCTLS + + /* + * This calls MOD_DEC_USE_COUNT until imon is marked as being used + * by only one module (the one making the call, which should then + * exit). This is so that you can replace the module if one of + * your tests croak without calling MOD_DEC_USE_COUNT. 
+ */ + case IMONIOC_RESET: + printk(KERN_DEBUG "IMONIOC_RESET: MOD_IN_USE == %d\n", MOD_IN_USE); + while(MOD_IN_USE > 1) MOD_DEC_USE_COUNT; + break; + + /* + * This disables imon events, in case something goes haywire + * during development. + */ + case IMONIOC_DISABLE: + printk(KERN_DEBUG "IMONIOC_DISABLE: disabling imon\n"); + imon_enabled = 0; + imon_event = NULL; + imon_broadcast = NULL; + break; + +#endif /* ALLOW_DEBUG_IOCTLS */ + + default: + return -EINVAL; + } + return 0; +} + +/* + * imon_read + * + * Description: + * Read events off of the queue. + * + * Reads as many events as will fit in user's buffer. Normally + * waits for at least one event and then reads as many as are + * available without blocking. If the device is opened in + * non-delay mode, then it will return immediately with EAGAIN. + * + * Parameters: + * buf Gets data read. + * length number of bytes to read. + * offset ignored + * + * Returns: + * Number of bytes read if successful, -errno if error. + */ +/* ARGSUSED2 */ +ssize_t +imon_read(struct file *file, char *buf, size_t length, loff_t *offset) +{ + int nread = 0, sleepok; + qelem_t qe; + + if (length < sizeof(qelem_t)) { + return -EINVAL; + } + + sleepok = !(file->f_flags & (O_NONBLOCK|O_NDELAY)); + while (length >= sizeof qe) { + if (dequeue(sleepok, &qe) < 0) { + break; + } + if (copy_to_user(buf, (void*)&qe, sizeof qe) > 0) { + return -EFAULT; + } + buf += sizeof qe; + nread += sizeof qe; + length -= sizeof qe; + sleepok = 0; + } + + if (nread == 0 && !sleepok) { + return -EWOULDBLOCK; + } + + return nread; +} + +/* + * imon_poll + * + * Description: + * Called by select and poll system call code to see what's + * available. Imon only provides select notification for read. + * + * Parameters: + * file imon device + * table stuff for putting process to sleep if no data is + * available. + * + * Returns: + * Flags indicating what kind of data is available. 
+ */ +static unsigned int +imon_poll(struct file *file, struct poll_table_struct *table) +{ + if (!imonqtest()) { + ino_ev_q.q_wanted = 1; + poll_wait(file, &imon_wait, table); + return 0; + } + return POLLIN | POLLRDNORM; +} + +/* + * imon_open + * + * Description: + * Open the imon device (typically /dev/imon). Only one process + * at a time can have imon open. + * + * Parameters: + * inode imon device inode + * file imon device file + * + * Returns: + * 0 if successful, -errno if error. + */ +static int +imon_open(struct inode *inode, struct file *file) +{ + int error; + + down(&opensema); + /* multiple clients are not supported */ + if (imon_enabled) { + up(&opensema); + return -EBUSY; + } + + if ((error = initqueue()) < 0) { + up(&opensema); + return error; + } +#if DISABLE_EVENTS + printk(KERN_DEBUG "NOT turning on imon_enabled\n"); +#else + /* or should this be delayed until the first express? */ + imon_event = __imon_event; + imon_broadcast = __imon_broadcast; + imon_enabled = 1; +#endif + MOD_INC_USE_COUNT; + up(&opensema); + return 0; +} + +/* + * imon_release + * + * Description: + * Called when the imon device is closed. + * + * Parameters: + * inode imon device inode + * file imon device file + * + * Returns: + * 0 if successful, -errno if error. + */ +static int +imon_release(struct inode *inode, struct file *file) +{ + down(&opensema); + imon_enabled = 0; + imon_event = NULL; + imon_broadcast = NULL; + + /* + * Shutdown hash table. Calls may still be made to + * probehash() after the inode monitor enabled flag is turned + * off since the flag is not protected. Calling clearhash() + * will effectively disable probehash() so that no further + * notifications will occur. + */ + clearhash(); + + freequeue(); + + MOD_DEC_USE_COUNT; + up(&opensema); + return 0; +} + +/*----------------------------queue operators----------------------------*/ +/* + * The queue is implemented as a circular list of events. 
New events are + * matched against the last QBACKSTEPS events and if the dev/inode are the + * same, then the two events are or'd together. After an event is entered + * into the queue, a timeout of imon_qlag ticks is set to wakeup the reader. + * If the queue fills beyond q_thresh elements, the timeout is canceled and + * the reader is woken immediately. + */ + +/* Queue size (somewhat less than two 4K pages worth) */ +static int imon_qsize = 2 * 4000 / sizeof(qelem_t); + +/* ticks to wait before waking up user */ +static int imon_qlag = 20; + +/* how far back to look for event compress */ +static int imon_qbacksteps = 10; + +MODULE_PARM(imon_qsize, "i"); +MODULE_PARM(imon_qlag, "i"); +MODULE_PARM(imon_qbacksteps, "i"); + +/* + * dequeuewakeup + * + * Description: + * Wake up a process waiting for imon IO. + */ +/*ARGSUSED*/ +static void +dequeuewakeup(unsigned long arg) +{ + wake_up_interruptible(&imon_wait); + ino_ev_q.q_tpending = 0; + ino_ev_q.q_wanted = 0; +} + +/* + * enqueue + * + * Description: + * Enqueue an imon event. + * + * Parameters: + * dev device of event. + * ino inode of event. + * cause cause of event. + */ +static void +enqueue(dev_t dev, ino_t ino, intmask_t cause) +{ + /* + * We may get here even when imon_enabled is 0 because of a + * race between this function and imon_release(). + */ + if (imon_enabled == 0) { + return; /* imon_release'ed */ + } + + spin_lock(&qlock); + + /* Make sure queue is still active once inside of qlock */ + if (ino_ev_q.q_base == 0) { + goto out; + } + + /* last enqueue filled last slot in queue */ + if (ino_ev_q.q_over) { + goto out; + } + + /* Check if this event is happening to an inode that already has + * an outstanding event and if so reduce them to one event. + * + * There's a potential race condition here: If a program + * execs-exits-execs, then the user program won't be able to + * tell the current state. 
Therefore, these events turn off + * their counterparts so the event the user sees reflects the + * current state. + */ + if (ino_ev_q.q_tail != ino_ev_q.q_head) { + qelem_t *qe = ino_ev_q.q_tail; + int steps = 0; + do { + if (qe == ino_ev_q.q_base) + qe = ino_ev_q.q_last; + else + --qe; + if (qe->qe_inode == ino && qe->qe_dev == dev) { + if (cause == IMON_EXEC) { + qe->qe_what &= ~IMON_EXIT; + } else if (cause == IMON_EXIT) { + qe->qe_what &= ~IMON_EXEC; + } + qe->qe_what |= cause; + goto out; + } + } while (qe != ino_ev_q.q_head && steps++ < imon_qbacksteps); + } + + /* Put data in q slot */ + ino_ev_q.q_tail->qe_inode = ino; + ino_ev_q.q_tail->qe_dev = dev; + ino_ev_q.q_tail->qe_what = cause; + + /* Bump pointer to next available slot, wrap if we're at the limit. */ + if (ino_ev_q.q_tail == ino_ev_q.q_last) + ino_ev_q.q_tail = ino_ev_q.q_base; + else + ino_ev_q.q_tail++; + + ++ino_ev_q.q_count; + + /* See if we've caught up with our tail. */ + if (ino_ev_q.q_tail == ino_ev_q.q_head) { + ino_ev_q.q_over = 1; + ASSERT(ino_ev_q.q_tpending == 0); /*should have passed qthresh*/ + printk(KERN_WARNING "/dev/imon: event queue overflow\n"); + } else if (ino_ev_q.q_count == 1 && !ino_ev_q.q_tpending + && ino_ev_q.q_wanted) { + /* on the first event, start a timeout to wake up user */ + ino_ev_q.q_tpending = 1; + init_timer(&ino_ev_q.q_timer); + ino_ev_q.q_timer.expires = imon_qlag; + ino_ev_q.q_timer.data = 0; + ino_ev_q.q_timer.function = dequeuewakeup; + add_timer(&ino_ev_q.q_timer); + } else { + unsigned long flags; + save_flags(flags); + cli(); + /* + * on subsequent events, check for a rapidly filling queue + * and wakeup user early if so + */ + if (ino_ev_q.q_count>ino_ev_q.q_thresh && ino_ev_q.q_tpending) { + del_timer(&ino_ev_q.q_timer); + ino_ev_q.q_tpending = 0; + dequeuewakeup(0); + } + restore_flags(flags); + } + +out: + spin_unlock(&qlock); +} + +/* + * static int + * imonqtest(void) + * + * Description: + * Test to see if there are any imon events pending. 
+ * + * Returns: + * 1 if events are pending, 0 otherwise. + */ +static int +imonqtest(void) +{ + return ino_ev_q.q_count ? 1 : 0; +} + +/* + * dequeue + * + * Description: + * Get an element off the queue. + * + * Parameters: + * sleepok if 1, sleep until an event occurs if the queue is + * empty. + * qe Gets the event. + * + * Returns: + * 0 if successful, -errno otherwise. + */ +static int +dequeue(int sleepok, qelem_t *qe) +{ + static qelem_t oe = { 0, 0, IMON_OVER }; /* Overflow token */ + + spin_lock(&qlock); + + /* If we've overflowed, then throw out everything and start over. */ + if (ino_ev_q.q_over) { + ino_ev_q.q_over = 0; + ino_ev_q.q_count = 0; + spin_unlock(&qlock); + *qe = oe; + return 0; + } + + while (ino_ev_q.q_tail == ino_ev_q.q_head) { /* queue is empty */ + if (!sleepok) { + spin_unlock(&qlock); + return -EAGAIN; + } + ino_ev_q.q_wanted = 1; + spin_unlock(&qlock); + if (signal_pending(current)) { + return -ERESTARTSYS; + } + interruptible_sleep_on(&imon_wait); + spin_lock(&qlock); + } + + *qe = *ino_ev_q.q_head; + --ino_ev_q.q_count; + + if (ino_ev_q.q_head == ino_ev_q.q_last) + ino_ev_q.q_head = ino_ev_q.q_base; + else + ino_ev_q.q_head++; + + spin_unlock(&qlock); + return 0; +} + +/* + * initqueue + * + * Description: + * Initialize event queue. + * + * Returns: + * 0 if successful, -errno otherwise. + */ +static int +initqueue(void) +{ + qelem_t *base; + int qsize = imon_qsize; + + base = kmalloc(qsize * sizeof(qelem_t), GFP_KERNEL); + if (base == NULL) { + return -ENOMEM; + } + ino_ev_q.q_head = ino_ev_q.q_tail = ino_ev_q.q_base = base; + ino_ev_q.q_last = base + qsize - 1; + ino_ev_q.q_thresh = qsize / 4; + ino_ev_q.q_count = 0; + return 0; +} + +/* + * freequeue + * + * Description: + * Drain and delete event queue. 
+ */ +static void +freequeue(void) +{ + void *base = 0; + + spin_lock(&qlock); + + ino_ev_q.q_tail = ino_ev_q.q_head = 0; + + if (ino_ev_q.q_base) { + base = ino_ev_q.q_base; + ino_ev_q.q_base = 0; + } + + if (ino_ev_q.q_tpending) { + del_timer(&ino_ev_q.q_timer); + ino_ev_q.q_tpending = 0; + } + + spin_unlock(&qlock); + + if (base) { + kfree(base); + } +} + +/*----------------------------hash operators-----------------------------*/ + +#define ACTIVE(x) ((x)->qe_what) + +/* If an element is added to a table which already contains UPPER_ALPHA(n) + * elements, it will grow. UPPER_ALPHA is 1/2 of 2^n. + */ +#define UPPER_ALPHA(n) (1 << ((n) - 1)) +/* If an element is removed from a table which will already allow + * LOWER_ALPHA(n) elements to be added before growing, it will shrink. + * LOWER_ALPHA is 3/4 of UPPER_ALPHA. + */ +#define LOWER_ALPHA(n) ((1 << ((n) - 1)) - (1 << (n - 3))) + +/* + * Create a new hash table of size 2^shift entries. + */ +static int +ihnew(inthash_t *ih, u_short shift) +{ + int count = 1 << shift; + qelem_t *qe; +#if METER_ON + meter_t empty_meters = { 0L, }; +#endif + /* + * Waste a portion of the entries to get a good alpha. + */ + ih->ih_numfree = UPPER_ALPHA(shift); + ih->ih_shift = shift; + ih->ih_base = kmalloc(count * sizeof(qelem_t), GFP_KERNEL); + if (ih->ih_base == NULL) { + return -ENOMEM; + } + ih->ih_limit = ih->ih_base + count; + + /* kmalloc doesn't zero-out pages? */ + for (qe = ih->ih_base; qe < ih->ih_limit; ++qe) { + qe->qe_what = 0; + } + +#if METER_ON + /* clear the meters */ + ih->ih_meters = empty_meters; +#endif + return 0; +} + +/* + * Multiplicative hash with linear probe. + */ +static int imon_hashhisize = 17; /* max size of hash is (1<qe_dev ^ (ik)->qe_inode))) >> (32 - (ih)->ih_shift)) + +/* + * Return a pointer to the entry matching ik if it exists, otherwise + * to the empty entry in which ik should be installed. 
+ */ +static qelem_t * +ihlookup(inthash_t *ih, qelem_t *ik, int adding) +{ + qelem_t *ihe; + + METER(ih->ih_meters.im_lookups++); + ihe = &ih->ih_base[IHHASH(ih, ik)]; + while (ACTIVE(ihe) && + (ihe->qe_inode != ik->qe_inode || ihe->qe_dev != ik->qe_dev)) { + /* + * Because ih_numfree was initialized to less than the full + * size of the hash table size, we need not check for table + * fullness. We're guaranteed to hit an empty (!ACTIVE(ihe)) + * entry before revisiting the primary hash. + */ + METER(ih->ih_meters.im_probes++); + if (adding) + ihe->qe_what |= IMON_COLLISION; + if (ihe == ih->ih_base) + ihe = ih->ih_limit; + --ihe; + } + + return ihe; +} + +/* + * Add a new key/value pair to the inode hash. + * If the hash overflows, and its size is below its maximum, grow the + * hash by doubling its size and re-inserting all of its elements. + */ +static int +ihadd(inthash_t *ih, qelem_t *ik) +{ + qelem_t *ihe; + int error; + + ASSERT(ACTIVE(ik)); + if (ih->ih_numfree == 0) { + inthash_t tih; + + if (ih->ih_shift == imon_hashhisize) + return -ENOMEM; + /* + * Hash table is full, so double its size and re-insert all + * active entries plus the new one, ik. + */ + error = ihnew(&tih, ih->ih_shift + 1); + if (error) { + return error; + } + METER(ih->ih_meters.im_grows++); + for (ihe = ih->ih_base; ihe < ih->ih_limit; ihe++) + if (ACTIVE(ihe)) { + /* ihadd can't fail here, because + * we just allocated tih to be big + * enough. + */ + ihe->qe_what &= ~IMON_COLLISION; + (void) ihadd(&tih,ihe); + } +#if METER_ON + tih.ih_meters = ih->ih_meters; /* copy the meters */ +#endif + kfree(ih->ih_base); + *ih = tih; + } + + ihe = ihlookup(ih, ik, 1); + if (ACTIVE(ihe)) + ihe->qe_what |= (ik->qe_what & IMON_USERMASK); + else { + METER(ih->ih_meters.im_adds++); + ASSERT(ih->ih_numfree > 0); + --ih->ih_numfree; + *ihe = *ik; + } + return 0; +} + +/* + * Remove an interest from the hash and shrink the hash table if necessary. 
+ */ +static void +ihdelete(inthash_t *ih, qelem_t *ik) +{ + qelem_t *ihe; + int error; + + ihe = ihlookup(ih, ik, 0); + if (!ACTIVE(ihe)) + return; + ihe->qe_what &= ~ik->qe_what; + if ((ihe->qe_what & ~IMON_COLLISION) != 0) + return; + METER(ih->ih_meters.im_deletes++); + ASSERT(ih->ih_numfree < UPPER_ALPHA(ih->ih_shift)); + ih->ih_numfree++; + + if (ih->ih_shift > imon_hashlosize && + ih->ih_numfree > LOWER_ALPHA(ih->ih_shift)) { + inthash_t tih; + + /* + * The hash table is empty enough that we'll shrink it. + */ + error = ihnew(&tih, ih->ih_shift - 1); + if (!error) { + ihe->qe_what = 0; + METER(ih->ih_meters.im_shrinks++); + for (ihe = ih->ih_base; ihe < ih->ih_limit; ihe++) + if (ACTIVE(ihe)) { + /* ihadd can't fail here because tih + * is big enough to hold all of the + * entries we're moving into it. + */ + ihe->qe_what &= ~IMON_COLLISION; + (void) ihadd(&tih, ihe); + } +#if METER_ON + tih.ih_meters = ih->ih_meters; /* copy the meters */ +#endif + kfree(ih->ih_base); + *ih = tih; + } + } else if (ihe->qe_what & IMON_COLLISION) { + int slot, slot2, hash; + qelem_t *ihe2; + + /* + * If not shrinking, reorganize any colliding entries. + */ + /* XXX This logic is bogus. There are at least 2 cases where + ** this gives you incorrect results. See imon.txt. + */ + slot = slot2 = ihe - ih->ih_base; + for (;;) { + ihe->qe_what = 0; + do { + if (--slot2 < 0) + slot2 += 1 << ih->ih_shift; + ihe2 = &ih->ih_base[slot2]; + if (!ACTIVE(ihe2)) + return; + METER(ih->ih_meters.im_dprobes++); + hash = IHHASH(ih, ihe2); + } while ((slot2 <= hash && hash < slot) + || (slot < slot2 && (hash < slot + || slot2 <= hash))); + METER(ih->ih_meters.im_dmoves++); + *ihe = *ihe2; + ihe = ihe2; + slot = slot2; + } + } +} + +/* + * hashinsert + * + * Description: + * Insert a new interest into our hash table. + * + * Parameters: + * dev device of interest. + * inum inode of interest. + * what event mask. + * + * Returns: + * 0 if successful, -errno if error. 
+ */ +static int +hashinsert(dev_t dev, ino_t inum, intmask_t what) +{ + qelem_t ik; + int error = 0; + static time_t nextprint, ntickstowait; + + down(&hashsema); + + if (!imon_htable.ih_base) { + error = ihnew(&imon_htable, imon_hashlosize); + } + + if (!error) { + ik.qe_dev = dev; + ik.qe_inode = inum; + ik.qe_what = what; + + error = ihadd(&imon_htable,&ik); + } + + if (error == -ENOMEM && jiffies >= nextprint) { + /* + * Prevent printks from clobbering performance. + * A client may continue to express interest after + * a hash table overflow. + */ + printk(KERN_WARNING + "/dev/imon: hash table overflow\n"); + if (jiffies > nextprint + ntickstowait) + ntickstowait = 100; /* (re)start */ + else if (ntickstowait < 6400) /* ~ 1 minute */ + ntickstowait *= 2; + nextprint = jiffies + ntickstowait; + } + + up(&hashsema); + return error; +} + +/* + * hashremove + * + * Description: + * Remove an interest from our hash table. + * + * Parameters: + * dev device of interest to remove. + * inum inode of interest to remove. + * what event mask of interest to remove. + */ +static void +hashremove(dev_t dev, ino_t inum, intmask_t what) +{ + qelem_t ik; + + down(&hashsema); + + if (!imon_htable.ih_base) { + up(&hashsema); + return; + } + + ik.qe_dev = dev; + ik.qe_inode = inum; + ik.qe_what = what; + + ihdelete(&imon_htable, &ik); + + up(&hashsema); +} + +/* + * Check if a given dev/inode pair is in the interest hash table. + * Returns interest mask if in table, 0 otherwise. + */ +static intmask_t +probehash(dev_t dev, ino_t inum) +{ + qelem_t ik; + intmask_t what; + + down(&hashsema); + + /* check if hash table is not allocated yet */ + if (imon_htable.ih_base == NULL) { + up(&hashsema); + return 0; + } + + ik.qe_dev = dev; + ik.qe_inode = inum; + + what = ihlookup(&imon_htable,&ik,0)->qe_what & IMON_EVENTMASK; + + METER(what ? 
imon_htable.ih_meters.im_hits++ + : imon_htable.ih_meters.im_misses++); + + up(&hashsema); + return what; +} + +/* + * clearhash + * + * Description: + * Empty the hash table. + */ +static void +clearhash(void) +{ + void *base; + + if (imon_htable.ih_base == 0) + return; + down(&hashsema); + base = imon_htable.ih_base; + imon_htable.ih_base = 0; + up(&hashsema); + kfree(base); +} + +/* + * __imon_event + * + * Description: + * Called by filesystem code when something that imon might care + * about happens to an inode. + * + * Parameters: + * inode inode that an event occurred on. + * event mask of event that occurred. + */ +static void __imon_event(struct inode *inode, int event) +{ + ino_t ino = inode->i_ino; + dev_t dev = inode->i_dev; + if (probehash(dev, ino)) { + enqueue(dev, ino, event); + } +} + +/* + * Broadcast event to all active nodes on a device. + */ +static void __imon_broadcast(dev_t dev, int event) +{ + qelem_t *qe; + + /* + * Broadcast IMON_DELETE for all interests on this device when + * device is unmounted. + */ + if (event == IMON_UNMOUNT) { + event = IMON_DELETE; + } + + /* check if hash table is not allocated yet */ + down(&hashsema); + if (imon_htable.ih_base == NULL) { + up(&hashsema); + return; + } + + /* run through table and check each element */ + for (qe = imon_htable.ih_base; qe < imon_htable.ih_limit; qe++) { + if (ACTIVE(qe) && (dev == qe->qe_dev)) { + enqueue(dev, qe->qe_inode, event); + } + } + + up(&hashsema); +} + + +module_init(init_imon) + + +#if DEBUG_INTO_PROC +/* + * This generates debugging info for /proc/fs/imon-hash.
+ */ +static int +imon_hash_proc_info(char *bf, char **start, off_t offset, int len) +{ + int wr = 0; + qelem_t *qe; + int i = 0; + int wm = len - 80; /* hopefully none of the lines below will be longer +than that */ + + down(&hashsema); + + wr += sprintf(bf + wr, "shift:\t%u\n", imon_htable.ih_shift); + wr += sprintf(bf + wr, "free:\t%u\n", imon_htable.ih_numfree); + wr += sprintf(bf + wr, "base:\t%lx\n", (long)(imon_htable.ih_base)); + + if (!imon_htable.ih_base) { + up(&hashsema); + return wr; + } + + wr += sprintf(bf + wr, "entries:\n"); + for (i = 0; i < imon_htable.ih_limit - imon_htable.ih_base; ++i) { + qe = imon_htable.ih_base + i; + if (!(ACTIVE(qe))) continue; + wr += sprintf(bf + wr, "%d: %ld/%ld", i, qe->qe_dev, qe->qe_inode); + if (qe->qe_what & IMON_COLLISION) wr += sprintf(bf + wr, " *%d", + IHHASH(&imon_htable, qe)); + wr += sprintf(bf + wr, "\n"); + + if (wr > wm) { + wr += sprintf(bf + wr, "(limit hit!)\n"); + up(&hashsema); + return wr; + } + } + + wr += sprintf(bf + wr, "OK!\n"); + up(&hashsema); + return wr; +} + +/* + * This generates debugging info for /proc/fs/imon-meter. 
+ */ +static int +imon_meter_proc_info(char *bf, char **start, off_t offset, int len) +{ + int wr = 0; + +#if METER_ON + int shift, numfree; + meter_t m; + + down(&hashsema); + shift = imon_htable.ih_shift; + numfree = imon_htable.ih_numfree; + m = imon_htable.ih_meters; + up(&hashsema); + + if (shift == 0) { + wr += sprintf(bf + wr, "no hash at the moment.\n"); + return wr; + } + + wr += sprintf(bf + wr, "min shift:\t%d (%d entries)\n", imon_hashlosize, (1 << imon_hashlosize)); + wr += sprintf(bf + wr, "max shift:\t%d (%d entries)\n", imon_hashhisize, (1 << imon_hashhisize)); + wr += sprintf(bf + wr, "shift:\t\t%d (%d entries)\n", shift, (1 << shift)); + wr += sprintf(bf + wr, "upper alpha:\t%d/%d\n", UPPER_ALPHA(shift), (1 << shift)); + wr += sprintf(bf + wr, "lower alpha:\t%d/%d\n", UPPER_ALPHA(shift) - LOWER_ALPHA(shift), (1 << shift)); +/* wr += sprintf(bf + wr, "interests:\t%ld in %ld entries\n", m.m_ints + m.m_ents, m.m_ents); */ + + wr += sprintf(bf + wr, "current table:\n"); + wr += sprintf(bf + wr, "\tlookups:\t%lu\n", m.im_lookups); + wr += sprintf(bf + wr, "\tprobes:\t\t%lu\n", m.im_probes); + wr += sprintf(bf + wr, "\thits:\t\t%lu\n", m.im_hits); + wr += sprintf(bf + wr, "\tmisses:\t\t%lu\n", m.im_misses); + wr += sprintf(bf + wr, "\tadds:\t\t%lu\n", m.im_adds); + wr += sprintf(bf + wr, "\tgrows:\t\t%lu\n", m.im_grows); + wr += sprintf(bf + wr, "\tshrinks:\t%lu\n", m.im_shrinks); + wr += sprintf(bf + wr, "\tdeletes:\t%lu\n", m.im_deletes); + wr += sprintf(bf + wr, "\tdprobes:\t%lu\n", m.im_dprobes); + wr += sprintf(bf + wr, "\tdmoves:\t\t%lu\n", m.im_dmoves); +#else + wr += sprintf(bf + wr, + "no metering data; rebuild imon.o with METER_ON set.\n"); +#endif /* METER_ON */ + + return wr; +} + +#endif /* DEBUG_INTO_PROC */ diff -Nuar linux-2.4.17.SuSE/fs/imon/imon_static.c linux-2.4.17.SuSE.imon/fs/imon/imon_static.c --- linux-2.4.17.SuSE/fs/imon/imon_static.c Thu Jan 1 01:00:00 1970 +++ linux-2.4.17.SuSE.imon/fs/imon/imon_static.c Sun Feb 3 20:19:17
2002 @@ -0,0 +1,39 @@ +/* + imon_static - static symbols for imon + Copyright (C) 1999 Silicon Graphics, Inc. All Rights Reserved. + + This program is free software; you can redistribute it and/or modify it + under the terms of version 2 of the GNU General Public License as + published by the Free Software Foundation. + + This program is distributed in the hope that it would be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. Further, any + license provided herein, whether implied or otherwise, is limited to + this program in accordance with the express provisions of the GNU + General Public License. Patent licenses, if any, provided herein do not + apply to combinations of this program with other product or programs, or + any other product whatsoever. This program is distributed without any + warranty that the program is delivered free of the rightful claim of any + third person by way of infringement or the like. See the GNU General + Public License for more details. + + You should have received a copy of the GNU General Public License along + with this program; if not, write the Free Software Foundation, Inc., 59 + Temple Place - Suite 330, Boston MA 02111-1307, USA. +*/ + +/* + * If a kernel is configured for imon support, there are a few symbols + * which are used even when the imon module is never loaded. Hopefully + * these will all go away when imon is replacing the tables of file & + * inode operations. 
+ */ + +#include + +#if defined(CONFIG_IMON) || defined(CONFIG_IMON_MODULE) +void (*imon_event)(struct inode *, int); +void (*imon_broadcast)(dev_t dev, int event); +int imon_enabled; +#endif diff -Nuar linux-2.4.17.SuSE/fs/namei.c linux-2.4.17.SuSE.imon/fs/namei.c --- linux-2.4.17.SuSE/fs/namei.c Wed Jan 23 15:19:25 2002 +++ linux-2.4.17.SuSE.imon/fs/namei.c Sun Feb 3 20:19:17 2002 @@ -19,6 +19,7 @@ #include #include #include +#include /* this can go away when imon is done */ #include #include #include @@ -971,6 +972,7 @@ lock_kernel(); error = dir->i_op->create(dir, dentry, mode); unlock_kernel(); + IMON_EVENT_NOERR(error, dir, IMON_CONTENT); exit_lock: up(&dir->i_zombie); if (!error) @@ -1292,6 +1294,7 @@ default: error = -EINVAL; } + IMON_EVENT_NOERR(error, dentry->d_inode, IMON_CONTENT); dput(dentry); } up(&nd.dentry->d_inode->i_sem); @@ -1320,6 +1323,7 @@ lock_kernel(); error = dir->i_op->mkdir(dir, dentry, mode); unlock_kernel(); + IMON_EVENT_NOERR(error, dir, IMON_CONTENT); exit_lock: up(&dir->i_zombie); @@ -1410,6 +1414,8 @@ lock_kernel(); error = dir->i_op->rmdir(dir, dentry); unlock_kernel(); + IMON_EVENT_NOERR(error, dir, IMON_CONTENT); + IMON_EVENT_NOERR(error, dentry->d_inode, IMON_DELETE); if (!error) dentry->d_inode->i_flags |= S_DEAD; } @@ -1481,6 +1487,8 @@ lock_kernel(); error = dir->i_op->unlink(dir, dentry); unlock_kernel(); + IMON_EVENT_NOERR(error, dir, IMON_CONTENT); + IMON_EVENT_NOERR(error, dentry->d_inode, IMON_DELETE); if (!error) d_delete(dentry); } @@ -1552,6 +1560,7 @@ lock_kernel(); error = dir->i_op->symlink(dir, dentry, oldname); unlock_kernel(); + IMON_EVENT_NOERR(error, dir, IMON_CONTENT); exit_lock: up(&dir->i_zombie); @@ -1626,6 +1635,7 @@ lock_kernel(); error = dir->i_op->link(old_dentry, dir, new_dentry); unlock_kernel(); + IMON_EVENT_NOERR(error, dir, IMON_CONTENT); exit_lock: up(&dir->i_zombie); @@ -1789,8 +1799,12 @@ double_up(&old_dir->i_zombie, &new_dir->i_zombie); - if (!error) + if (!error) { 
d_move(old_dentry,new_dentry); + IMON_EVENT(old_dir, IMON_CONTENT); + IMON_EVENT(new_dir, IMON_CONTENT); + IMON_EVENT(old_dentry->d_inode, IMON_DELETE); + } out_unlock: up(&old_dir->i_sb->s_vfs_rename_sem); return error; @@ -1831,6 +1845,10 @@ double_up(&old_dir->i_zombie, &new_dir->i_zombie); if (error) return error; + IMON_EVENT(old_dir, IMON_CONTENT); + IMON_EVENT(new_dir, IMON_CONTENT); + IMON_EVENT(old_dentry->d_inode, IMON_DELETE); + IMON_EVENT(new_dentry->d_inode, IMON_CONTENT); /* The following d_move() should become unconditional */ if (!(old_dir->i_sb->s_type->fs_flags & FS_ODD_RENAME)) { d_move(old_dentry, new_dentry); diff -Nuar linux-2.4.17.SuSE/fs/read_write.c linux-2.4.17.SuSE.imon/fs/read_write.c --- linux-2.4.17.SuSE/fs/read_write.c Sun Aug 5 22:12:41 2001 +++ linux-2.4.17.SuSE.imon/fs/read_write.c Sun Feb 3 20:19:17 2002 @@ -10,6 +10,7 @@ #include #include #include +#include /* this can go away when imon is done */ #include #include @@ -185,8 +186,10 @@ if (!ret) { ssize_t (*write)(struct file *, const char *, size_t, loff_t *); ret = -EINVAL; - if (file->f_op && (write = file->f_op->write) != NULL) + if (file->f_op && (write = file->f_op->write) != NULL) { ret = write(file, buf, count, &file->f_pos); + IMON_EVENT(inode, IMON_CONTENT); + } } } if (ret > 0) @@ -332,8 +335,10 @@ if (!file) goto bad_file; if (file->f_op && (file->f_mode & FMODE_WRITE) && - (file->f_op->writev || file->f_op->write)) + (file->f_op->writev || file->f_op->write)) { ret = do_readv_writev(VERIFY_READ, file, vector, count); + IMON_EVENT(file->f_dentry->d_inode, IMON_CONTENT); + } fput(file); bad_file: diff -Nuar linux-2.4.17.SuSE/include/linux/fs.h linux-2.4.17.SuSE.imon/include/linux/fs.h --- linux-2.4.17.SuSE/include/linux/fs.h Wed Jan 23 15:20:08 2002 +++ linux-2.4.17.SuSE.imon/include/linux/fs.h Sun Feb 3 20:28:44 2002 @@ -439,6 +439,7 @@ unsigned long i_ino; atomic_t i_count; + unsigned int i_execount; kdev_t i_dev; umode_t i_mode; nlink_t i_nlink; diff -Nuar 
linux-2.4.17.SuSE/include/linux/imon.h linux-2.4.17.SuSE.imon/include/linux/imon.h --- linux-2.4.17.SuSE/include/linux/imon.h Thu Jan 1 01:00:00 1970 +++ linux-2.4.17.SuSE.imon/include/linux/imon.h Sun Feb 3 20:28:48 2002 @@ -0,0 +1,168 @@ +/* + Copyright (C) 1999 Silicon Graphics, Inc. All Rights Reserved. + + This program is free software; you can redistribute it and/or modify it + under the terms of version 2 of the GNU General Public License as + published by the Free Software Foundation. + + This program is distributed in the hope that it would be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. Further, any + license provided herein, whether implied or otherwise, is limited to + this program in accordance with the express provisions of the GNU + General Public License. Patent licenses, if any, provided herein do not + apply to combinations of this program with other product or programs, or + any other product whatsoever. This program is distributed without any + warranty that the program is delivered free of the rightful claim of any + third person by way of infringement or the like. See the GNU General + Public License for more details. + + You should have received a copy of the GNU General Public License along + with this program; if not, write the Free Software Foundation, Inc., 59 + Temple Place - Suite 330, Boston MA 02111-1307, USA. +*/ + +/* + * imon.h - inode monitor definitions + * + * $Revision: $ + * $Date: $ + * + * Overview + * + * This driver provides a mechanism whereby a user process can monitor + * various system activity on a list of files. The user process + * expresses interest in files by passing the name of each file via an + * ioctl, and then does a read on the device to obtain event records. + * Each event record contains the device and inode number of the + * modified file and a field describing the action that took place. 
+ * + * User Interface + * + * The express-interest ioctl takes three arguments: the pathname to the + * target file, a bitmask indicating which events are of interest, and + * optionally a pointer to a stat buffer in which the current state of + * the file is returned. Multiple expressions on the same or different + * files are permitted. + * + * The revoke-interest ioctl takes three arguments: the device and + * inode of the target file, and the interests to revoke. Once all + * interest flags which have been expressed for a file have been + * revoked, imon stops monitoring that file. + * + * The device is currently implemented as an exclusive open driver. + * + * It is possible to monitor all types of filesystems, although events + * are generated only for local activity on nfs files. + * + * Note: The unsigned long type is used rather than ino_t and dev_t in + * many of the structures below because of lack of agreement between + * the glibc and Linux on what primitive types ino_t and dev_t resolve + * to. glibc does the conversions for system calls like "stat", but + * this is a raw ioctl interface and we'd like to avoid doing such + * conversions in the kernel. 
+ */ + +#ifndef _LINUX_IMON_H +#define _LINUX_IMON_H + +#include +#include +#include + +#ifdef __cplusplus +extern "C" { +#endif + +typedef unsigned short intmask_t; + +#define IMON_NAME "imon" /* Name of imon driver */ + +/* Interest mask bits */ +#define IMON_CONTENT (1 << 0) /* contents or size have changed */ +#define IMON_ATTRIBUTE (1 << 1) /* mode or ownership have changed */ +#define IMON_DELETE (1 << 2) /* last link has gone away */ +#define IMON_EXEC (1 << 3) /* process executing */ +#define IMON_EXIT (1 << 4) /* last process exited */ +#ifdef __KERNEL__ +#define IMON_UNMOUNT (1 << 14) /* filesystem has been unmounted */ +#define IMON_COLLISION (1 << 15) /* internal hash collision flag */ +#endif + +#define IMON_OVER 0xffff /* queue has overflowed */ +#define IMON_EVENTMASK 0x0fff /* interest bits for all events */ + +/* User-specifiable mask bits */ +#define IMON_USERMASK (IMON_EVENTMASK) + +/* Event queue element */ +typedef struct qelem { + unsigned long qe_inode; /* inode number of file */ + unsigned long qe_dev; /* device of file */ + intmask_t qe_what; /* what events occurred */ +} qelem_t; + +/* Status return buffer for IMONIOC_EXPRESS. This can be used by the + * caller to determine precisely which device and inode are being + * monitored; an IMONIOC_EXPRESS immediately followed by a stat is + * subject to race conditions. We'd like to pass an entire stat + * buffer back to user space, but glibc and Linux have different + * notions of what a stat buffer looks like which makes this very + * inconvenient. 
+ */ +typedef struct famstat { + unsigned long st_dev; /* device of file */ + unsigned long st_ino; /* inode number of file */ +} famstat_t; + +/* Interest structure for express/revoke */ +typedef struct interest { + const char *in_fname; /* pathname */ + famstat_t *in_sb; /* optional status return buffer */ + intmask_t in_what; /* what types of events to send */ +} interest_t; + +/* arg structure for IMONIOC_REVOKE */ +typedef struct revoke { + unsigned long rv_dev; + unsigned long rv_ino; + intmask_t rv_what; +} revoke_t; + +#define IMONIOC_QTEST _IO('i', 3) +#define IMONIOC_EXPRESS _IOW('i', 4, interest_t) +#define IMONIOC_REVOKE _IOW('i', 5, revoke_t) +/* to use these ioctls, set ALLOW_DEBUG_IOCTLS in imon.c */ +#define IMONIOC_RESET _IO('i', 6) +#define IMONIOC_DISABLE _IO('i', 7) + +#ifdef __KERNEL__ + +#include + +#if defined(CONFIG_IMON) || defined(CONFIG_IMON_MODULE) + +extern void (*imon_event)(struct inode *, int); +extern void (*imon_broadcast)(dev_t dev, int event); +extern int imon_enabled; + +#define IMON_EVENT(ip,ev) if (imon_enabled && ip) { (*imon_event)(ip,ev); } +#define IMON_EVENT_NOERR(err,ip,ev) \ + if (imon_enabled && ip && !err) { (*imon_event)(ip,ev); } +#define IMON_BROADCAST(dev,ev) if (imon_enabled) { (*imon_broadcast)(dev,ev);} + +#else /* defined(CONFIG_IMON) || defined(CONFIG_IMON_MODULE) */ + +#define IMON_EVENT_NOERR(err,ip,ev) +#define IMON_EVENT(ip,ev) +#define IMON_BROADCAST(dev,ev) + +#endif /* defined(CONFIG_IMON) || defined(CONFIG_IMON_MODULE) */ + +#endif /* __KERNEL__ */ + +#ifdef __cplusplus +} +#endif + +#endif /* _LINUX_IMON_H */ diff -Nuar linux-2.4.17.SuSE/include/linux/sched.h linux-2.4.17.SuSE.imon/include/linux/sched.h --- linux-2.4.17.SuSE/include/linux/sched.h Wed Jan 23 15:20:08 2002 +++ linux-2.4.17.SuSE.imon/include/linux/sched.h Sun Feb 3 20:28:44 2002 @@ -408,6 +408,8 @@ struct fs_struct *fs; /* open file information */ struct files_struct *files; +/* For keeping track of whether or not a file is executing
*/ + struct dentry *script; /* signal handlers */ spinlock_t sigmask_lock; /* Protects signal and blocked */ struct signal_struct *sig; @@ -533,6 +535,7 @@ thread: INIT_THREAD, \ fs: &init_fs, \ files: &init_files, \ + script: NULL, \ sigmask_lock: SPIN_LOCK_UNLOCKED, \ sig: &init_signals, \ pending: { NULL, &tsk.pending.head, {{0}}}, \ diff -Nuar linux-2.4.17.SuSE/kernel/exit.c linux-2.4.17.SuSE.imon/kernel/exit.c --- linux-2.4.17.SuSE/kernel/exit.c Wed Jan 23 15:19:24 2002 +++ linux-2.4.17.SuSE.imon/kernel/exit.c Sun Feb 3 20:19:17 2002 @@ -15,6 +15,7 @@ #ifdef CONFIG_BSD_PROCESS_ACCT #include #endif +#include /* hopefully this can go away when imon is done */ #include #include @@ -445,6 +446,15 @@ panic("Attempted to kill the idle task!"); if (tsk->pid == 1) panic("Attempted to kill init!"); +#ifdef CONFIG_EXECOUNT + if (tsk->script) { + if (--(tsk->script->d_inode->i_execount) == 0) { + IMON_EVENT(tsk->script->d_inode, IMON_EXIT); + } + dput(tsk->script); + tsk->script = NULL; + } +#endif tsk->flags |= PF_EXITING; del_timer_sync(&tsk->real_timer); diff -Nuar linux-2.4.17.SuSE/kernel/fork.c linux-2.4.17.SuSE.imon/kernel/fork.c --- linux-2.4.17.SuSE/kernel/fork.c Wed Jan 23 15:19:24 2002 +++ linux-2.4.17.SuSE.imon/kernel/fork.c Sun Feb 3 20:19:17 2002 @@ -616,6 +616,12 @@ if (p->binfmt && p->binfmt->module) __MOD_INC_USE_COUNT(p->binfmt->module); +#ifdef CONFIG_EXECOUNT + if (p->script) { + p->script = dget(p->script); + p->script->d_inode->i_execount++; + } +#endif p->did_exec = 0; p->swappable = 0; diff -Nuar linux-2.4.17.SuSE/kernel/ksyms.c linux-2.4.17.SuSE.imon/kernel/ksyms.c --- linux-2.4.17.SuSE/kernel/ksyms.c Wed Jan 23 15:19:26 2002 +++ linux-2.4.17.SuSE.imon/kernel/ksyms.c Sun Feb 3 20:26:59 2002 @@ -40,6 +40,7 @@ #include #include #include +#include /* this can go away when imon is done */ #include #include #include @@ -293,6 +294,11 @@ EXPORT_SYMBOL(lock_may_read); EXPORT_SYMBOL(lock_may_write); EXPORT_SYMBOL(dcache_readdir); +#if 
defined(CONFIG_IMON_MODULE) +EXPORT_SYMBOL(imon_event); +EXPORT_SYMBOL(imon_broadcast); +EXPORT_SYMBOL(imon_enabled); +#endif EXPORT_SYMBOL(buffermem_pages); EXPORT_SYMBOL(nr_free_pages); EXPORT_SYMBOL(page_cache_size);
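
Note on the event coalescing in enqueue() (fs/imon/imon.c in this patch): the
merge step is easy to lose among the locking and timer code. What follows is a
minimal user-space sketch of that step only, and it is not part of the patch.
Assumptions: sketch_event, sketch_queue, sketch_enqueue, SKETCH_QSIZE and
SKETCH_BACKSTEPS are hypothetical names that do not exist in imon; the queue
is indexed by head/count instead of the q_head/q_tail pointers; a full queue
is reported through the return value instead of the q_over flag and the
IMON_OVER token; and there is no locking or wakeup timer. The event bit values
are copied from include/linux/imon.h.

```c
#include <assert.h>
#include <stddef.h>

/* Event bits, copied from include/linux/imon.h in this patch. */
#define IMON_CONTENT (1 << 0)
#define IMON_EXEC    (1 << 3)
#define IMON_EXIT    (1 << 4)

/* Hypothetical sizes for the sketch; imon uses imon_qsize/imon_qbacksteps. */
#define SKETCH_QSIZE     8u
#define SKETCH_BACKSTEPS 4u

/* Hypothetical stand-in for qelem_t. */
struct sketch_event {
    unsigned long inode;
    unsigned long dev;
    unsigned short what;
};

/* Hypothetical stand-in for the kernel event queue (no lock, no timer). */
struct sketch_queue {
    struct sketch_event slot[SKETCH_QSIZE];
    size_t head;   /* index of the oldest queued event */
    size_t count;  /* number of queued events */
};

/*
 * Queue an event, merging it into a recent event for the same dev/inode.
 * Returns 1 if merged, 0 if appended, -1 if the queue is full.
 */
static int
sketch_enqueue(struct sketch_queue *q, unsigned long dev,
               unsigned long inode, unsigned short cause)
{
    size_t limit, steps, idx;

    assert(q != NULL);

    /* Look back over at most SKETCH_BACKSTEPS of the newest events. */
    limit = q->count < SKETCH_BACKSTEPS ? q->count : SKETCH_BACKSTEPS;
    for (steps = 1; steps <= limit; steps++) {
        struct sketch_event *e;

        idx = (q->head + q->count - steps) % SKETCH_QSIZE;
        e = &q->slot[idx];
        if (e->inode == inode && e->dev == dev) {
            /* EXEC and EXIT cancel each other so the mask shows current state. */
            if (cause == IMON_EXEC)
                e->what &= (unsigned short)~IMON_EXIT;
            else if (cause == IMON_EXIT)
                e->what &= (unsigned short)~IMON_EXEC;
            e->what |= cause;
            return 1;
        }
    }

    if (q->count == SKETCH_QSIZE)
        return -1;

    /* No recent match: append a new event at the tail. */
    idx = (q->head + q->count) % SKETCH_QSIZE;
    q->slot[idx].inode = inode;
    q->slot[idx].dev = dev;
    q->slot[idx].what = cause;
    q->count++;
    return 0;
}
```

With a zero-initialized struct sketch_queue, queueing IMON_EXEC and then
IMON_EXIT for the same dev/inode leaves one event whose mask holds only
IMON_EXIT. That is the behaviour the enqueue() comment describes as the event
reflecting the current state.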