
HDFS Permission Checking

Basic concepts

Parent path vs. ancestor path

HDFS permission checks frequently need to decide whether the caller has access to the current path and to its parent path. When the parent path itself does not exist, a further concept is needed: the ancestor path, i.e. the closest parent node that actually exists.

For example, creating a file actually requires w+x on the parent node; more precisely, (w+x) on the ancestor path. Once this concept is clear, reasoning about directory and file permissions becomes much easier.

If the parent node exists, the ancestor node is simply the parent node. If nothing along the way exists, the ancestor node is the root path (/).

For example, with the current path /a/b/c as path, the parent path is /a/b, and the ancestor path is the closest node among the ancestors that actually exists: if /a/b exists it is /a/b, otherwise it may be /a or /.

The algorithm for locating the ancestor path:

    for (; ancestorIndex >= 0 && inodes[ancestorIndex] == null;
        ancestorIndex--);
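
A minimal standalone sketch of that scan (my own illustration, not HDFS code; the null entries in the hypothetical array stand for path components that do not exist yet):

public class AncestorScanDemo {
    public static void main(String[] args) {
        // Path /a/b/c where only "/" and "/a" exist:
        // index 0 = "/", 1 = "a", 2 = "b" (missing), 3 = "c" (missing)
        String[] inodes = {"/", "a", null, null};

        // Start at the parent slot (length - 2) and walk back to the last
        // component that actually exists, exactly like the HDFS loop above.
        int ancestorIndex = inodes.length - 2;
        for (; ancestorIndex >= 0 && inodes[ancestorIndex] == null;
                ancestorIndex--);

        System.out.println("ancestorIndex = " + ancestorIndex
                + ", ancestor = " + inodes[ancestorIndex]);  // 1, "a"
    }
}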

The Javadoc in the HDFS source spells out the difference between the parent path and the ancestor path:

package org.apache.hadoop.hdfs.server.namenode;

public class FSPermissionChecker implements AccessControlEnforcer {

/**
* Check whether current user have permissions to access the path.
* Traverse is always checked.
*
* Parent path means the parent directory for the path.
* Ancestor path means the last (the closest) existing ancestor directory
* of the path.
* Note that if the parent path exists,
* then the parent path and the ancestor path are the same.
*
* For example, suppose the path is "/foo/bar/baz".
* No matter baz is a file or a directory,
* the parent path is "/foo/bar".
* If bar exists, then the ancestor path is also "/foo/bar".
* If bar does not exist and foo exists,
* then the ancestor path is "/foo".
* Further, if both foo and bar do not exist,
* then the ancestor path is "/".
*
* @param doCheckOwner Require user to be the owner of the path?
* @param ancestorAccess The access required by the ancestor of the path.
* @param parentAccess The access required by the parent of the path.
* @param access The access required by the path.
* @param subAccess If path is a directory,
* it is the access required of the path and all the sub-directories.
* If path is not a directory, there is no effect.
* @param ignoreEmptyDir Ignore permission checking for empty directory?
* @throws AccessControlException
*
* Guarded by {@link FSNamesystem#readLock()}
* Caller of this method must hold that lock.
*/
void checkPermission(INodesInPath inodesInPath, boolean doCheckOwner,
FsAction ancestorAccess, FsAction parentAccess, FsAction access,
FsAction subAccess, boolean ignoreEmptyDir)
throws AccessControlException {
if (LOG.isDebugEnabled()) {
LOG.debug("ACCESS CHECK: " + this
+ ", doCheckOwner=" + doCheckOwner
+ ", ancestorAccess=" + ancestorAccess
+ ", parentAccess=" + parentAccess
+ ", access=" + access
+ ", subAccess=" + subAccess
+ ", ignoreEmptyDir=" + ignoreEmptyDir);
}
// check if (parentAccess != null) && file exists, then check sb
// If resolveLink, the check is performed on the link target.
final int snapshotId = inodesInPath.getPathSnapshotId();
final INode[] inodes = inodesInPath.getINodesArray();
final INodeAttributes[] inodeAttrs = new INodeAttributes[inodes.length];
final byte[][] components = inodesInPath.getPathComponents();
for (int i = 0; i < inodes.length && inodes[i] != null; i++) {
inodeAttrs[i] = getINodeAttrs(components, i, inodes[i], snapshotId);
}

String path = inodesInPath.getPath();
int ancestorIndex = inodes.length - 2;

AccessControlEnforcer enforcer = getAccessControlEnforcer();
enforcer.checkPermission(fsOwner, supergroup, callerUgi, inodeAttrs, inodes,
components, snapshotId, path, ancestorIndex, doCheckOwner,
ancestorAccess, parentAccess, access, subAccess, ignoreEmptyDir);
}

}

HDFS permission definition

HDFS uses the standard rwx model. Each octal digit is a combination of permission bits: 0 is no permission, 1 is execute (x), 2 is write (w), 4 is read (r), so 3 is wx, 5 is rx, and 6 is rw.

Permission checks are then carried out with bitwise boolean operations on these values; the FsAction ordinal doubles as the bitmask.


package org.apache.hadoop.fs.permission;

import org.apache.hadoop.classification.InterfaceAudience;
import org.apache.hadoop.classification.InterfaceStability;

/**
* File system actions, e.g. read, write, etc.
*/
@InterfaceAudience.Public
@InterfaceStability.Stable
public enum FsAction {
// POSIX style
NONE("---"),
EXECUTE("--x"),
WRITE("-w-"),
WRITE_EXECUTE("-wx"),
READ("r--"),
READ_EXECUTE("r-x"),
READ_WRITE("rw-"),
ALL("rwx");

/** Retain reference to value array. */
private final static FsAction[] vals = values();

/** Symbolic representation */
public final String SYMBOL;

private FsAction(String s) {
SYMBOL = s;
}

/**
* Return true if this action implies that action.
* @param that
*/
public boolean implies(FsAction that) {
if (that != null) {
return (ordinal() & that.ordinal()) == that.ordinal();
}
return false;
}

/** AND operation. */
public FsAction and(FsAction that) {
return vals[ordinal() & that.ordinal()];
}
/** OR operation. */
public FsAction or(FsAction that) {
return vals[ordinal() | that.ordinal()];
}
/** NOT operation. */
public FsAction not() {
return vals[7 - ordinal()];
}

/**
* Get the FsAction enum for String representation of permissions
*
* @param permission
* 3-character string representation of permission. ex: rwx
* @return Returns FsAction enum if the corresponding FsAction exists for permission.
* Otherwise returns null
*/
public static FsAction getFsAction(String permission) {
for (FsAction fsAction : vals) {
if (fsAction.SYMBOL.equals(permission)) {
return fsAction;
}
}
return null;
}
}
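
A small usage sketch (my own example, assuming hadoop-common is on the classpath) showing how the ordinal doubles as the rwx bitmask:

import org.apache.hadoop.fs.permission.FsAction;

public class FsActionDemo {
    public static void main(String[] args) {
        // READ_WRITE has ordinal 6 (binary 110), so rw- implies r-- but not --x.
        System.out.println(FsAction.READ_WRITE.implies(FsAction.READ));    // true
        System.out.println(FsAction.READ_WRITE.implies(FsAction.EXECUTE)); // false

        // Bitwise combinations map straight back onto enum values.
        System.out.println(FsAction.READ.or(FsAction.WRITE));        // READ_WRITE
        System.out.println(FsAction.ALL.and(FsAction.READ_EXECUTE)); // READ_EXECUTE
        System.out.println(FsAction.WRITE.not());                    // READ_EXECUTE
    }
}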

Permission check implementation

The checkPermission entry point

It works out the ancestor/parent node indices, assembles the arguments, and then delegates to the external AccessControlEnforcer's checkPermission.

  /*
* @param doCheckOwner Require user to be the owner of the path?
* @param ancestorAccess The access required by the ancestor of the path.
* @param parentAccess The access required by the parent of the path.
* @param access The access required by the path.
* @param subAccess If path is a directory,
* it is the access required of the path and all the sub-directories.
* If path is not a directory, there is no effect.
* @param ignoreEmptyDir Ignore permission checking for empty directory?
* @throws AccessControlException
*
* Guarded by {@link FSNamesystem#readLock()}
* Caller of this method must hold that lock.
*/
void checkPermission(INodesInPath inodesInPath, boolean doCheckOwner,
FsAction ancestorAccess, FsAction parentAccess, FsAction access,
FsAction subAccess, boolean ignoreEmptyDir)
throws AccessControlException {
if (LOG.isDebugEnabled()) {
LOG.debug("ACCESS CHECK: " + this
+ ", doCheckOwner=" + doCheckOwner
+ ", ancestorAccess=" + ancestorAccess
+ ", parentAccess=" + parentAccess
+ ", access=" + access
+ ", subAccess=" + subAccess
+ ", ignoreEmptyDir=" + ignoreEmptyDir);
}
// check if (parentAccess != null) && file exists, then check sb
// If resolveLink, the check is performed on the link target.
final int snapshotId = inodesInPath.getPathSnapshotId();
final INode[] inodes = inodesInPath.getINodesArray();
final INodeAttributes[] inodeAttrs = new INodeAttributes[inodes.length];
final byte[][] components = inodesInPath.getPathComponents();
for (int i = 0; i < inodes.length && inodes[i] != null; i++) {
inodeAttrs[i] = getINodeAttrs(components, i, inodes[i], snapshotId);
}

String path = inodesInPath.getPath();
int ancestorIndex = inodes.length - 2;

AccessControlEnforcer enforcer = getAccessControlEnforcer();
enforcer.checkPermission(fsOwner, supergroup, callerUgi, inodeAttrs, inodes,
components, snapshotId, path, ancestorIndex, doCheckOwner,
ancestorAccess, parentAccess, access, subAccess, ignoreEmptyDir);
}
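
This is also where the earlier claim that creating a file needs w+x on the ancestor becomes concrete: the x part is enforced by the traversal check, while the w part arrives as ancestorAccess. Callers in FSDirectory wrap the common parameter combinations in small helpers roughly like the following (paraphrased, so treat the exact names and signatures as approximate):

// Paraphrased sketch: each helper fills exactly one slot of the
// (ancestorAccess, parentAccess, access, subAccess) parameter list.
void checkAncestorAccess(FSPermissionChecker pc, INodesInPath iip,
    FsAction access) throws AccessControlException {
  // e.g. create/mkdirs: WRITE on the last existing ancestor
  checkPermission(pc, iip, false, access, null, null, null);
}

void checkParentAccess(FSPermissionChecker pc, INodesInPath iip,
    FsAction access) throws AccessControlException {
  // e.g. delete/rename: WRITE on the parent directory
  checkPermission(pc, iip, false, null, access, null, null);
}

void checkPathAccess(FSPermissionChecker pc, INodesInPath iip,
    FsAction access) throws AccessControlException {
  // e.g. reading a file: READ on the path itself
  checkPermission(pc, iip, false, null, null, access, null);
}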

A fairly complete call chain for a complex permission check

It first locates the effective ancestor node (the last one that actually exists), then checkTraverse verifies traversal permission along the entire path, and afterwards it checks, in turn, the sticky bit (only when parentAccess implies WRITE), ancestor access, parent access, access on the path itself, sub-tree access, and finally ownership.

  @Override
public void checkPermission(String fsOwner, String supergroup,
UserGroupInformation callerUgi, INodeAttributes[] inodeAttrs,
INode[] inodes, byte[][] components, int snapshotId, String path,
int ancestorIndex, boolean doCheckOwner, FsAction ancestorAccess,
FsAction parentAccess, FsAction access, FsAction subAccess,
boolean ignoreEmptyDir)
throws AccessControlException {
for(; ancestorIndex >= 0 && inodes[ancestorIndex] == null;
ancestorIndex--);

try {
checkTraverse(inodeAttrs, inodes, components, ancestorIndex);
} catch (UnresolvedPathException | ParentNotDirectoryException ex) {
// must tunnel these exceptions out to avoid breaking interface for
// external enforcer
throw new TraverseAccessControlException(ex);
}

final INodeAttributes last = inodeAttrs[inodeAttrs.length - 1];
if (parentAccess != null && parentAccess.implies(FsAction.WRITE)
&& inodeAttrs.length > 1 && last != null) {
checkStickyBit(inodeAttrs, components, inodeAttrs.length - 2);
}
if (ancestorAccess != null && inodeAttrs.length > 1) {
check(inodeAttrs, components, ancestorIndex, ancestorAccess);
}
if (parentAccess != null && inodeAttrs.length > 1) {
check(inodeAttrs, components, inodeAttrs.length - 2, parentAccess);
}
if (access != null) {
check(inodeAttrs, components, inodeAttrs.length - 1, access);
}
if (subAccess != null) {
INode rawLast = inodes[inodeAttrs.length - 1];
checkSubAccess(components, inodeAttrs.length - 1, rawLast,
snapshotId, subAccess, ignoreEmptyDir);
}
if (doCheckOwner) {
checkOwner(inodeAttrs, components, inodeAttrs.length - 1);
}
}

checkTraverse walks every component up to the ancestor: each existing inode must be a directory (otherwise ParentNotDirectoryException is thrown) and the caller must have execute permission on it.



/** Guarded by {@link FSNamesystem#readLock()}
* @throws AccessControlException
* @throws ParentNotDirectoryException
* @throws UnresolvedPathException
*/
private void checkTraverse(INodeAttributes[] inodeAttrs, INode[] inodes,
byte[][] components, int last) throws AccessControlException,
UnresolvedPathException, ParentNotDirectoryException {
for (int i=0; i <= last; i++) {
checkIsDirectory(inodes[i], components, i);
check(inodeAttrs, components, i, FsAction.EXECUTE);
}
}

Another version of checkTraverse, the outer entry point, appears to delegate to an external FSPermissionChecker; when the checker is null or the caller is a superuser, only the directory structure is verified.


/**
* Verifies that all existing ancestors are directories. If a permission
* checker is provided then the user must have exec access. Ancestor
* symlinks will throw an unresolved exception, and resolveLink determines
* if the last inode will throw an unresolved exception. This method
* should always be called after a path is resolved into an IIP.
* @param pc for permission checker, null for no checking
* @param iip path to verify
* @param resolveLink whether last inode may be a symlink
* @throws AccessControlException
* @throws UnresolvedPathException
* @throws ParentNotDirectoryException
*/
static void checkTraverse(FSPermissionChecker pc, INodesInPath iip,
boolean resolveLink) throws AccessControlException,
UnresolvedPathException, ParentNotDirectoryException {
try {
if (pc == null || pc.isSuperUser()) {
checkSimpleTraverse(iip);
} else {
pc.checkPermission(iip, false, null, null, null, null, false);
}
} catch (TraverseAccessControlException tace) {
// unwrap the non-ACE (unresolved, parent not dir) exception
// tunneled out of checker.
tace.throwCause();
}
// maybe check that the last inode is a symlink
if (resolveLink) {
int last = iip.length() - 1;
checkNotSymlink(iip.getINode(last), iip.getPathComponents(), last);
}
}

// rudimentary permission-less directory check
private static void checkSimpleTraverse(INodesInPath iip)
throws UnresolvedPathException, ParentNotDirectoryException {
byte[][] components = iip.getPathComponents();
for (int i=0; i < iip.length() - 1; i++) {
INode inode = iip.getINode(i);
if (inode == null) {
break;
}
checkIsDirectory(inode, components, i);
}
}

check: evaluating an inode's user/group permission attributes


/** Guarded by {@link FSNamesystem#readLock()} */
private void check(INodeAttributes[] inodes, byte[][] components, int i,
FsAction access) throws AccessControlException {
INodeAttributes inode = (i >= 0) ? inodes[i] : null;
if (inode != null && !hasPermission(inode, access)) {
throw new AccessControlException(
toAccessControlString(inode, getPath(components, 0, i), access));
}
}

hasPermission fetches the permission mode, first checks whether an access ACL applies, and otherwise falls back to the classic user / group / other classes; this is where the rwx details actually live.



// return whether access is permitted. note it neither requires a path or
// throws so the caller can build the path only if required for an exception.
// very beneficial for subaccess checks!
private boolean hasPermission(INodeAttributes inode, FsAction access) {
if (inode == null) {
return true;
}
final FsPermission mode = inode.getFsPermission();
final AclFeature aclFeature = inode.getAclFeature();
if (aclFeature != null && aclFeature.getEntriesSize() > 0) {
// It's possible that the inode has a default ACL but no access ACL.
int firstEntry = aclFeature.getEntryAt(0);
if (AclEntryStatusFormat.getScope(firstEntry) == AclEntryScope.ACCESS) {
return hasAclPermission(inode, access, mode, aclFeature);
}
}
final FsAction checkAction;
if (getUser().equals(inode.getUserName())) { //user class
checkAction = mode.getUserAction();
} else if (isMemberOfGroup(inode.getGroupName())) { //group class
checkAction = mode.getGroupAction();
} else { //other class
checkAction = mode.getOtherAction();
}
return checkAction.implies(access);
}
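
A standalone sketch (my own example, using the public FsPermission API) of the same user/group/other evaluation for a directory with mode 750:

import org.apache.hadoop.fs.permission.FsAction;
import org.apache.hadoop.fs.permission.FsPermission;

public class ModeCheckDemo {
    public static void main(String[] args) {
        // rwxr-x--- : the owner gets rwx, the group gets r-x, everyone else nothing.
        FsPermission mode = new FsPermission((short) 0750);

        System.out.println(mode.getUserAction().implies(FsAction.WRITE));    // true
        System.out.println(mode.getGroupAction().implies(FsAction.EXECUTE)); // true
        System.out.println(mode.getOtherAction().implies(FsAction.READ));    // false
    }
}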

The complete external flow

How permission checking is hooked into HDFS file operations

Only after reading the source does it become clear exactly which permission each HDFS operation requires and at which point the check happens.

Take the delete operation as an example. In FSNamesystem.delete below, checkOperation(OperationCategory.WRITE) only verifies that this NameNode can currently serve write operations (active vs. standby in an HA setup); the user-level permission check happens inside FSDirDeleteOp.delete via the FSPermissionChecker pc, which requires write permission on the parent directory before the removal is carried out.

package org.apache.hadoop.hdfs.server.namenode;

@InterfaceAudience.Private
@Metrics(context="dfs")
public class FSNamesystem implements Namesystem, FSNamesystemMBean,
NameNodeMXBean, ReplicatedBlocksMBean, ECBlockGroupsMBean {
/**
* Remove the indicated file from namespace.
*
* @see ClientProtocol#delete(String, boolean) for detailed description and
* description of exceptions
*/
boolean delete(String src, boolean recursive, boolean logRetryCache)
throws IOException {
final String operationName = "delete";
BlocksMapUpdateInfo toRemovedBlocks = null;
checkOperation(OperationCategory.WRITE);
final FSPermissionChecker pc = getPermissionChecker();
writeLock();
boolean ret = false;
try {
checkOperation(OperationCategory.WRITE);
checkNameNodeSafeMode("Cannot delete " + src);
toRemovedBlocks = FSDirDeleteOp.delete(
this, pc, src, recursive, logRetryCache);
ret = toRemovedBlocks != null;
} catch (AccessControlException e) {
logAuditEvent(false, operationName, src);
throw e;
} finally {
writeUnlock(operationName);
}
getEditLog().logSync();
logAuditEvent(true, operationName, src);
if (toRemovedBlocks != null) {
removeBlocks(toRemovedBlocks); // Incremental deletion of blocks
}
return ret;
}
}
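
The user-level check for delete itself lives in FSDirDeleteOp.delete. Paraphrased from the 3.x source (the exact call may differ slightly), it requires WRITE on the parent directory and full access on the subtree being removed, with empty sub-directories ignored:

// Paraphrased sketch of the check inside FSDirDeleteOp.delete():
if (fsd.isPermissionEnabled()) {
  fsd.checkPermission(pc, iip, false /* doCheckOwner */,
      null /* ancestorAccess */, FsAction.WRITE /* parentAccess */,
      null /* access */, FsAction.ALL /* subAccess */,
      true /* ignoreEmptyDir */);
}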

Miscellaneous issues

Ranger and the native HDFS x permission

Ranger policies on HDFS (READ/WRITE/EXECUTE)

https://community.cloudera.com/t5/Support-Questions/Ranger-policies-on-HDFS-READ-WRITE-EXECUTE/m-p/214462

The native HDFS permission model requires the user to hold x on every ancestor directory in order to descend into a sub-directory. After the Ranger HDFS plugin is enabled, however, Ranger takes over this check; as the Ranger code shows, holding permission on a given path is enough to access it directly.

created at 2023-08-04