From: Kees Cook <kees.cook <at> canonical.com>
Subject: [PATCH] fs: block cross-uid sticky symlinks
Newsgroups: gmane.linux.kernel.lsm
Date: Thursday 27th May 2010 20:16:35 UTC
A long-standing class of security issues is the symlink-based
time-of-check-time-of-use race, most commonly seen in world-writable
directories like /tmp. The common method of exploitation of this flaw
is to cross privilege boundaries when following a given symlink (i.e. a
root process follows a symlink belonging to another user).  For a likely
incomplete list of hundreds of examples across the years, please see:

The solution is to permit symlinks to only be followed when outside a
sticky world-writable directory, or when the uid of the symlink and follower
match, or when the directory owner matches the symlink's owner.
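
As a rough illustration of the rule above, here is a minimal userspace
sketch of the decision as a pure function. It is not the kernel code; the
name may_follow_symlink() and its parameters are hypothetical, chosen only
for this example.

```c
#define _XOPEN_SOURCE 700	/* expose S_ISVTX under strict -std=c99/c11 */
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical model of the proposed policy (illustration only).
 * Returns true when following the symlink would be permitted.
 */
bool may_follow_symlink(mode_t dir_mode, uid_t dir_uid,
			uid_t link_uid, uid_t follower_uid)
{
	const mode_t sticky_ww = S_ISVTX | S_IWOTH;

	/* Outside a sticky world-writable directory: always permitted. */
	if ((dir_mode & sticky_ww) != sticky_ww)
		return true;

	/* The follower owns the symlink. */
	if (follower_uid == link_uid)
		return true;

	/* The directory owner owns the symlink. */
	if (dir_uid == link_uid)
		return true;

	/* Cross-uid symlink in a sticky world-writable directory. */
	return false;
}
```

In this model, the classic attack case (uid 0 following a symlink owned by
uid 1000 in a root-owned mode 1777 directory) is the only combination that
is refused.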

Some pointers to the history of earlier discussion that I could find:

 1996 Aug, Zygo Blaxell
 1996 Oct, Andrew Tridgell
 1997 Dec, Albert D Cahalan
 2005 Feb, Lorenzo Hernández García-Hierro

Past objections and rebuttals could be summarized as:

- Violates POSIX.
  - POSIX didn't consider this situation and it's not useful to follow
    a broken specification at the cost of security.
- Might break unknown applications that use this feature.
  - Applications that break because of the change are easy to spot and
    fix. Applications that are vulnerable to symlink ToCToU by not having
    the change aren't.
- Applications should just use mkstemp() or O_CREAT|O_EXCL.
  - True, but applications are not perfect, and new software is written
    all the time that makes these mistakes; blocking this flaw at the
    kernel is a single solution to the entire class of vulnerability.

This patch is based on the patch in Openwall and grsecurity.  I have
added a sysctl to toggle the behavior back to the old handling via
/proc/sys/fs/weak-sticky-symlinks, documentation, and a ratelimited
warning.
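
A userspace tool could check the toggle with something like the sketch
below. The helper read_int_sysctl() is a hypothetical name for this
example, and the file only exists on a kernel carrying this patch, so a
missing file is reported as an error rather than treated as "0".

```c
#include <stdio.h>

/*
 * Read a single integer from a sysctl-style file such as
 * /proc/sys/fs/weak-sticky-symlinks.
 * Returns 0 and stores the value on success, -1 on failure
 * (for example, when the file does not exist).
 */
int read_int_sysctl(const char *path, int *value)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (f == NULL)
		return -1;
	ok = (fscanf(f, "%d", value) == 1);
	fclose(f);
	return ok ? 0 : -1;
}
```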

Signed-off-by: Kees Cook 
 Documentation/sysctl/kernel.txt |   16 ++++++++++++++++
 include/linux/security.h        |    1 +
 kernel/sysctl.c                 |   10 ++++++++++
 security/capability.c           |    6 ------
 security/commoncap.c            |   25 +++++++++++++++++++++++++
 5 files changed, 52 insertions(+), 6 deletions(-)

diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index 3894eaa..6b059f6 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -66,6 +66,7 @@ show up in /proc/sys/kernel:
 - threads-max
 - unknown_nmi_panic
 - version
+- weak-sticky-symlinks
@@ -526,3 +527,18 @@ A small number of systems do generate NMI's for bizarre random reasons such as
 power management so the default is off. That sysctl works like the
 panic controls already in that directory.
+Following symlinks in world-writable sticky directories (like /tmp) can
+be dangerous due to time-of-check-time-of-use races that frequently result
+in security vulnerabilities.  By default, symlinks can only be followed in
+sticky world-writable directories if the symlink and the follower's uid
+match (or if the symlink is owned by the owner of the world-writable
+directory).
+
+The default value is "0".  To disable this protection, setting a value of
+"1" will allow symlinks in sticky world-writable directories to be followed by
+anyone.
diff --git a/include/linux/security.h b/include/linux/security.h
index 0c88191..a06d568 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -67,6 +67,7 @@ extern int cap_inode_setxattr(struct dentry *dentry, const char *name,
 extern int cap_inode_removexattr(struct dentry *dentry, const char *name);
 extern int cap_inode_need_killpriv(struct dentry *dentry);
 extern int cap_inode_killpriv(struct dentry *dentry);
+extern int cap_inode_follow_link(struct dentry *dentry, struct nameidata *nameidata);
 extern int cap_file_mmap(struct file *file, unsigned long reqprot,
 			 unsigned long prot, unsigned long flags,
 			 unsigned long addr, unsigned long addr_only);
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 997080f..bf2d68b 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -88,6 +88,7 @@ extern int sysctl_oom_dump_tasks;
 extern int max_threads;
 extern int core_uses_pid;
 extern int suid_dumpable;
+extern int weak_sticky_symlinks;
 extern char core_pattern[];
 extern unsigned int core_pipe_limit;
 extern int pid_max;
@@ -1463,6 +1464,15 @@ static struct ctl_table fs_table[] = {
 		.extra1		= &zero,
 		.extra2		= &two,
+	{
+		.procname	= "weak-sticky-symlinks",
+		.data		= &weak_sticky_symlinks,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &zero,
+		.extra2		= &one,
+	},
 		.procname	= "binfmt_misc",
diff --git a/security/capability.c b/security/capability.c
index 8168e3e..ff34291 100644
--- a/security/capability.c
+++ b/security/capability.c
@@ -169,12 +169,6 @@ static int cap_inode_readlink(struct dentry *dentry)
 	return 0;
 }

-static int cap_inode_follow_link(struct dentry *dentry,
-				 struct nameidata *nameidata)
-{
-	return 0;
-}
-
 static int cap_inode_permission(struct inode *inode, int mask)
 {
 	return 0;
diff --git a/security/commoncap.c b/security/commoncap.c
index 4e01599..e7eb397 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -29,6 +29,9 @@
+/* sysctl for symlink permissions checking */
+int weak_sticky_symlinks;
+
 /*
  * If a non-root user executes a setuid-root binary in
  * !secure(SECURE_NOROOT) mode, then we raise capabilities.
@@ -281,6 +284,28 @@ int cap_inode_killpriv(struct dentry *dentry)
 	return inode->i_op->removexattr(dentry, XATTR_NAME_CAPS);
 }

+int cap_inode_follow_link(struct dentry *dentry,
+			  struct nameidata *nameidata)
+{
+	const struct inode *parent = dentry->d_parent->d_inode;
+	const struct inode *inode = dentry->d_inode;
+	const struct cred *cred = current_cred();
+
+	if (weak_sticky_symlinks)
+		return 0;
+
+	if (S_ISLNK(inode->i_mode) &&
+	    (parent->i_mode & (S_ISVTX|S_IWOTH)) == (S_ISVTX|S_IWOTH) &&
+	    parent->i_uid != inode->i_uid &&
+	    cred->fsuid != inode->i_uid) {
+		printk_ratelimited(KERN_NOTICE "non-matching-uid symlink "
+			"following attempted in sticky-directory by "
+			"%s (fsuid %d)\n", current->comm, cred->fsuid);
+		return -EACCES;
+	}
+	return 0;
+}
+
 /*
  * Calculate the new process capability sets from the capability sets
  * to a file.

Kees Cook
Ubuntu Security Team