
HDFS Permission Checking

Basic concepts

Parent path vs. ancestor path

HDFS permission checks frequently need to decide whether the caller has access to the current path and to its parent path. When the parent path itself does not exist, a further concept is needed: the ancestor path, i.e. the closest parent node that actually exists.

For example, creating a file actually requires w+x on the parent node; more precisely, (w+x) on the ancestor path. Once this concept is clear, reasoning about directory and file permissions becomes much easier.

If the parent node exists, the ancestor node is simply the parent node. If nothing along the way exists, the ancestor node is the root path (/).

For example, with the current path /a/b/c as path, the parent path is /a/b, and the ancestor path is the closest node among the ancestors that actually exists: if /a/b exists it is /a/b, otherwise it may be /a or /.

The algorithm for locating the ancestor path:

    for (; ancestorIndex >= 0 && inodes[ancestorIndex] == null;
        ancestorIndex--);
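
A minimal standalone sketch of that scan (my own illustration, not HDFS code; the null entries in the hypothetical array stand for path components that do not exist yet):

public class AncestorScanDemo {
    public static void main(String[] args) {
        // Path /a/b/c where only "/" and "/a" exist:
        // index 0 = "/", 1 = "a", 2 = "b" (missing), 3 = "c" (missing)
        String[] inodes = {"/", "a", null, null};

        // Start at the parent slot (length - 2) and walk back to the last
        // component that actually exists, exactly like the HDFS loop above.
        int ancestorIndex = inodes.length - 2;
        for (; ancestorIndex >= 0 && inodes[ancestorIndex] == null;
                ancestorIndex--);

        System.out.println("ancestorIndex = " + ancestorIndex
                + ", ancestor = " + inodes[ancestorIndex]);  // 1, "a"
    }
}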

The Javadoc in the HDFS source spells out the difference between the parent path and the ancestor path:

package org.apache.hadoop.hdfs.server.namenode;

public class FSPermissionChecker implements AccessControlEnforcer {

/**
* Check whether current user have permissions to access the path.
* Traverse is always checked.
*
* Parent path means the parent directory for the path.
* Ancestor path means the last (the closest) existing ancestor directory
* of the path.
* Note that if the parent path exists,
* then the parent path and the ancestor path are the same.
*
* For example, suppose the path is "/foo/bar/baz".
* No matter baz is a file or a directory,
* the parent path is "/foo/bar".
* If bar exists, then the ancestor path is also "/foo/bar".
* If bar does not exist and foo exists,
* then the ancestor path is "/foo".
* Further, if both foo and bar do not exist,
* then the ancestor path is "/".
*
* @param doCheckOwner Require user to be the owner of the path?
* @param ancestorAccess The access required by the ancestor of the path.
* @param parentAccess The access required by the parent of the path.
* @param access The access required by the path.
* @param subAccess If path is a directory,
* it is the access required of the path and all the sub-directories.
* If path is not a directory, there is no effect.
* @param ignoreEmptyDir Ignore permission checking for empty directory?
* @throws AccessControlException
*
* Guarded by {@link FSNamesystem#readLock()}
* Caller of this method must hold that lock.
*/
void checkPermission(INodesInPath inodesInPath, boolean doCheckOwner,
FsAction ancestorAccess, FsAction parentAccess, FsAction access,
FsAction subAccess, boolean ignoreEmptyDir)
throws AccessControlException {
if (LOG.isDebugEnabled()) {
LOG.debug("ACCESS CHECK: " + this
+ ", doCheckOwner=" + doCheckOwner
+ ", ancestorAccess=" + ancestorAccess
+ ", parentAccess=" + parentAccess
+ ", access=" + access
+ ", subAccess=" + subAccess
+ ", ignoreEmptyDir=" + ignoreEmptyDir);
}
// check if (parentAccess != null) && file exists, then check sb
// If resolveLink, the check is performed on the link target.
final int snapshotId = inodesInPath.getPathSnapshotId();
final INode[] inodes = inodesInPath.getINodesArray();
final INodeAttributes[] inodeAttrs = new INodeAttributes[inodes.length];
final byte[][] components = inodesInPath.getPathComponents();
for (int i = 0; i < inodes.length && inodes[i] != null; i++) {
inodeAttrs[i] = getINodeAttrs(components, i, inodes[i], snapshotId);
}

String path = inodesInPath.getPath();
int ancestorIndex = inodes.length - 2;

AccessControlEnforcer enforcer = getAccessControlEnforcer();
enforcer.checkPermission(fsOwner, supergroup, callerUgi, inodeAttrs, inodes,
components, snapshotId, path, ancestorIndex, doCheckOwner,
ancestorAccess, parentAccess, access, subAccess, ignoreEmptyDir);
}

}

HDFS permission definition

HDFS uses the standard rwx model. Each octal digit is a combination of permission bits: 0 is no permission, 1 is execute (x), 2 is write (w), 4 is read (r), so 3 is wx, 5 is rx, and 6 is rw.

Permission checks are then carried out with bitwise boolean operations on these values; the FsAction ordinal doubles as the bitmask.


package org.apache.hadoop.fs.permission;

import org.apache.hadoop.classification.InterfaceAudience;
import org.apache.hadoop.classification.InterfaceStability;

/**
* File system actions, e.g. read, write, etc.
*/
@InterfaceAudience.Public
@InterfaceStability.Stable
public enum FsAction {
// POSIX style
NONE("---"),
EXECUTE("--x"),
WRITE("-w-"),
WRITE_EXECUTE("-wx"),
READ("r--"),
READ_EXECUTE("r-x"),
READ_WRITE("rw-"),
ALL("rwx");

/** Retain reference to value array. */
private final static FsAction[] vals = values();

/** Symbolic representation */
public final String SYMBOL;

private FsAction(String s) {
SYMBOL = s;
}

/**
* Return true if this action implies that action.
* @param that
*/
public boolean implies(FsAction that) {
if (that != null) {
return (ordinal() & that.ordinal()) == that.ordinal();
}
return false;
}

/** AND operation. */
public FsAction and(FsAction that) {
return vals[ordinal() & that.ordinal()];
}
/** OR operation. */
public FsAction or(FsAction that) {
return vals[ordinal() | that.ordinal()];
}
/** NOT operation. */
public FsAction not() {
return vals[7 - ordinal()];
}

/**
* Get the FsAction enum for String representation of permissions
*
* @param permission
* 3-character string representation of permission. ex: rwx
* @return Returns FsAction enum if the corresponding FsAction exists for permission.
* Otherwise returns null
*/
public static FsAction getFsAction(String permission) {
for (FsAction fsAction : vals) {
if (fsAction.SYMBOL.equals(permission)) {
return fsAction;
}
}
return null;
}
}
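
A small usage sketch (my own example, assuming hadoop-common is on the classpath) showing how the ordinal doubles as the rwx bitmask:

import org.apache.hadoop.fs.permission.FsAction;

public class FsActionDemo {
    public static void main(String[] args) {
        // READ_WRITE has ordinal 6 (binary 110), so rw- implies r-- but not --x.
        System.out.println(FsAction.READ_WRITE.implies(FsAction.READ));    // true
        System.out.println(FsAction.READ_WRITE.implies(FsAction.EXECUTE)); // false

        // Bitwise combinations map straight back onto enum values.
        System.out.println(FsAction.READ.or(FsAction.WRITE));        // READ_WRITE
        System.out.println(FsAction.ALL.and(FsAction.READ_EXECUTE)); // READ_EXECUTE
        System.out.println(FsAction.WRITE.not());                    // READ_EXECUTE
    }
}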

Permission check implementation

The checkPermission entry point

It works out the ancestor/parent node indices, assembles the arguments, and then delegates to the external AccessControlEnforcer's checkPermission.

  /*
* @param doCheckOwner Require user to be the owner of the path?
* @param ancestorAccess The access required by the ancestor of the path.
* @param parentAccess The access required by the parent of the path.
* @param access The access required by the path.
* @param subAccess If path is a directory,
* it is the access required of the path and all the sub-directories.
* If path is not a directory, there is no effect.
* @param ignoreEmptyDir Ignore permission checking for empty directory?
* @throws AccessControlException
*
* Guarded by {@link FSNamesystem#readLock()}
* Caller of this method must hold that lock.
*/
void checkPermission(INodesInPath inodesInPath, boolean doCheckOwner,
FsAction ancestorAccess, FsAction parentAccess, FsAction access,
FsAction subAccess, boolean ignoreEmptyDir)
throws AccessControlException {
if (LOG.isDebugEnabled()) {
LOG.debug("ACCESS CHECK: " + this
+ ", doCheckOwner=" + doCheckOwner
+ ", ancestorAccess=" + ancestorAccess
+ ", parentAccess=" + parentAccess
+ ", access=" + access
+ ", subAccess=" + subAccess
+ ", ignoreEmptyDir=" + ignoreEmptyDir);
}
// check if (parentAccess != null) && file exists, then check sb
// If resolveLink, the check is performed on the link target.
final int snapshotId = inodesInPath.getPathSnapshotId();
final INode[] inodes = inodesInPath.getINodesArray();
final INodeAttributes[] inodeAttrs = new INodeAttributes[inodes.length];
final byte[][] components = inodesInPath.getPathComponents();
for (int i = 0; i < inodes.length && inodes[i] != null; i++) {
inodeAttrs[i] = getINodeAttrs(components, i, inodes[i], snapshotId);
}

String path = inodesInPath.getPath();
int ancestorIndex = inodes.length - 2;

AccessControlEnforcer enforcer = getAccessControlEnforcer();
enforcer.checkPermission(fsOwner, supergroup, callerUgi, inodeAttrs, inodes,
components, snapshotId, path, ancestorIndex, doCheckOwner,
ancestorAccess, parentAccess, access, subAccess, ignoreEmptyDir);
}
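
This is also where the earlier claim that creating a file needs w+x on the ancestor becomes concrete: the x part is enforced by the traversal check, while the w part arrives as ancestorAccess. Callers in FSDirectory wrap the common parameter combinations in small helpers roughly like the following (paraphrased, so treat the exact names and signatures as approximate):

// Paraphrased sketch: each helper fills exactly one slot of the
// (ancestorAccess, parentAccess, access, subAccess) parameter list.
void checkAncestorAccess(FSPermissionChecker pc, INodesInPath iip,
    FsAction access) throws AccessControlException {
  // e.g. create/mkdirs: WRITE on the last existing ancestor
  checkPermission(pc, iip, false, access, null, null, null);
}

void checkParentAccess(FSPermissionChecker pc, INodesInPath iip,
    FsAction access) throws AccessControlException {
  // e.g. delete/rename: WRITE on the parent directory
  checkPermission(pc, iip, false, null, access, null, null);
}

void checkPathAccess(FSPermissionChecker pc, INodesInPath iip,
    FsAction access) throws AccessControlException {
  // e.g. reading a file: READ on the path itself
  checkPermission(pc, iip, false, null, null, access, null);
}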

A fairly complete call chain for a complex permission check

It first locates the effective ancestor node (the last one that actually exists), then checkTraverse verifies traversal permission along the entire path, and afterwards it checks, in turn, the sticky bit (only when parentAccess implies WRITE), ancestor access, parent access, access on the path itself, sub-tree access, and finally ownership.

  @Override
public void checkPermission(String fsOwner, String supergroup,
UserGroupInformation callerUgi, INodeAttributes[] inodeAttrs,
INode[] inodes, byte[][] components, int snapshotId, String path,
int ancestorIndex, boolean doCheckOwner, FsAction ancestorAccess,
FsAction parentAccess, FsAction access, FsAction subAccess,
boolean ignoreEmptyDir)
throws AccessControlException {
for(; ancestorIndex >= 0 && inodes[ancestorIndex] == null;
ancestorIndex--);

try {
checkTraverse(inodeAttrs, inodes, components, ancestorIndex);
} catch (UnresolvedPathException | ParentNotDirectoryException ex) {
// must tunnel these exceptions out to avoid breaking interface for
// external enforcer
throw new TraverseAccessControlException(ex);
}

final INodeAttributes last = inodeAttrs[inodeAttrs.length - 1];
if (parentAccess != null && parentAccess.implies(FsAction.WRITE)
&& inodeAttrs.length > 1 && last != null) {
checkStickyBit(inodeAttrs, components, inodeAttrs.length - 2);
}
if (ancestorAccess != null && inodeAttrs.length > 1) {
check(inodeAttrs, components, ancestorIndex, ancestorAccess);
}
if (parentAccess != null && inodeAttrs.length > 1) {
check(inodeAttrs, components, inodeAttrs.length - 2, parentAccess);
}
if (access != null) {
check(inodeAttrs, components, inodeAttrs.length - 1, access);
}
if (subAccess != null) {
INode rawLast = inodes[inodeAttrs.length - 1];
checkSubAccess(components, inodeAttrs.length - 1, rawLast,
snapshotId, subAccess, ignoreEmptyDir);
}
if (doCheckOwner) {
checkOwner(inodeAttrs, components, inodeAttrs.length - 1);
}
}

checkTraverse walks every component up to the ancestor: each existing inode must be a directory (otherwise ParentNotDirectoryException is thrown) and the caller must have execute permission on it.



/** Guarded by {@link FSNamesystem#readLock()}
* @throws AccessControlException
* @throws ParentNotDirectoryException
* @throws UnresolvedPathException
*/
private void checkTraverse(INodeAttributes[] inodeAttrs, INode[] inodes,
byte[][] components, int last) throws AccessControlException,
UnresolvedPathException, ParentNotDirectoryException {
for (int i=0; i <= last; i++) {
checkIsDirectory(inodes[i], components, i);
check(inodeAttrs, components, i, FsAction.EXECUTE);
}
}

Another version of checkTraverse, the outer entry point, appears to delegate to an external FSPermissionChecker; when the checker is null or the caller is a superuser, only the directory structure is verified.


/**
* Verifies that all existing ancestors are directories. If a permission
* checker is provided then the user must have exec access. Ancestor
* symlinks will throw an unresolved exception, and resolveLink determines
* if the last inode will throw an unresolved exception. This method
* should always be called after a path is resolved into an IIP.
* @param pc for permission checker, null for no checking
* @param iip path to verify
* @param resolveLink whether last inode may be a symlink
* @throws AccessControlException
* @throws UnresolvedPathException
* @throws ParentNotDirectoryException
*/
static void checkTraverse(FSPermissionChecker pc, INodesInPath iip,
boolean resolveLink) throws AccessControlException,
UnresolvedPathException, ParentNotDirectoryException {
try {
if (pc == null || pc.isSuperUser()) {
checkSimpleTraverse(iip);
} else {
pc.checkPermission(iip, false, null, null, null, null, false);
}
} catch (TraverseAccessControlException tace) {
// unwrap the non-ACE (unresolved, parent not dir) exception
// tunneled out of checker.
tace.throwCause();
}
// maybe check that the last inode is a symlink
if (resolveLink) {
int last = iip.length() - 1;
checkNotSymlink(iip.getINode(last), iip.getPathComponents(), last);
}
}

// rudimentary permission-less directory check
private static void checkSimpleTraverse(INodesInPath iip)
throws UnresolvedPathException, ParentNotDirectoryException {
byte[][] components = iip.getPathComponents();
for (int i=0; i < iip.length() - 1; i++) {
INode inode = iip.getINode(i);
if (inode == null) {
break;
}
checkIsDirectory(inode, components, i);
}
}

check: evaluating an inode's user/group permission attributes


/** Guarded by {@link FSNamesystem#readLock()} */
private void check(INodeAttributes[] inodes, byte[][] components, int i,
FsAction access) throws AccessControlException {
INodeAttributes inode = (i >= 0) ? inodes[i] : null;
if (inode != null && !hasPermission(inode, access)) {
throw new AccessControlException(
toAccessControlString(inode, getPath(components, 0, i), access));
}
}

hasPermission fetches the permission mode, first checks whether an access ACL applies, and otherwise falls back to the classic user / group / other classes; this is where the rwx details actually live.



// return whether access is permitted. note it neither requires a path or
// throws so the caller can build the path only if required for an exception.
// very beneficial for subaccess checks!
private boolean hasPermission(INodeAttributes inode, FsAction access) {
if (inode == null) {
return true;
}
final FsPermission mode = inode.getFsPermission();
final AclFeature aclFeature = inode.getAclFeature();
if (aclFeature != null && aclFeature.getEntriesSize() > 0) {
// It's possible that the inode has a default ACL but no access ACL.
int firstEntry = aclFeature.getEntryAt(0);
if (AclEntryStatusFormat.getScope(firstEntry) == AclEntryScope.ACCESS) {
return hasAclPermission(inode, access, mode, aclFeature);
}
}
final FsAction checkAction;
if (getUser().equals(inode.getUserName())) { //user class
checkAction = mode.getUserAction();
} else if (isMemberOfGroup(inode.getGroupName())) { //group class
checkAction = mode.getGroupAction();
} else { //other class
checkAction = mode.getOtherAction();
}
return checkAction.implies(access);
}
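
A standalone sketch (my own example, using the public FsPermission API) of the same user/group/other evaluation for a directory with mode 750:

import org.apache.hadoop.fs.permission.FsAction;
import org.apache.hadoop.fs.permission.FsPermission;

public class ModeCheckDemo {
    public static void main(String[] args) {
        // rwxr-x--- : the owner gets rwx, the group gets r-x, everyone else nothing.
        FsPermission mode = new FsPermission((short) 0750);

        System.out.println(mode.getUserAction().implies(FsAction.WRITE));    // true
        System.out.println(mode.getGroupAction().implies(FsAction.EXECUTE)); // true
        System.out.println(mode.getOtherAction().implies(FsAction.READ));    // false
    }
}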

The complete external flow

How permission checking is hooked into HDFS file operations

Only after reading the source does it become clear exactly which permission each HDFS operation requires and at which point the check happens.

Take the delete operation as an example. In FSNamesystem.delete below, checkOperation(OperationCategory.WRITE) only verifies that this NameNode can currently serve write operations (active vs. standby in an HA setup); the user-level permission check happens inside FSDirDeleteOp.delete via the FSPermissionChecker pc, which requires write permission on the parent directory before the removal is carried out.

package org.apache.hadoop.hdfs.server.namenode;

@InterfaceAudience.Private
@Metrics(context="dfs")
public class FSNamesystem implements Namesystem, FSNamesystemMBean,
NameNodeMXBean, ReplicatedBlocksMBean, ECBlockGroupsMBean {
/**
* Remove the indicated file from namespace.
*
* @see ClientProtocol#delete(String, boolean) for detailed description and
* description of exceptions
*/
boolean delete(String src, boolean recursive, boolean logRetryCache)
throws IOException {
final String operationName = "delete";
BlocksMapUpdateInfo toRemovedBlocks = null;
checkOperation(OperationCategory.WRITE);
final FSPermissionChecker pc = getPermissionChecker();
writeLock();
boolean ret = false;
try {
checkOperation(OperationCategory.WRITE);
checkNameNodeSafeMode("Cannot delete " + src);
toRemovedBlocks = FSDirDeleteOp.delete(
this, pc, src, recursive, logRetryCache);
ret = toRemovedBlocks != null;
} catch (AccessControlException e) {
logAuditEvent(false, operationName, src);
throw e;
} finally {
writeUnlock(operationName);
}
getEditLog().logSync();
logAuditEvent(true, operationName, src);
if (toRemovedBlocks != null) {
removeBlocks(toRemovedBlocks); // Incremental deletion of blocks
}
return ret;
}
}
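
The user-level check for delete itself lives in FSDirDeleteOp.delete. Paraphrased from the 3.x source (the exact call may differ slightly), it requires WRITE on the parent directory and full access on the subtree being removed, with empty sub-directories ignored:

// Paraphrased sketch of the check inside FSDirDeleteOp.delete():
if (fsd.isPermissionEnabled()) {
  fsd.checkPermission(pc, iip, false /* doCheckOwner */,
      null /* ancestorAccess */, FsAction.WRITE /* parentAccess */,
      null /* access */, FsAction.ALL /* subAccess */,
      true /* ignoreEmptyDir */);
}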

Miscellaneous issues

Ranger and the native HDFS x permission

Ranger policies on HDFS (READ/WRITE/EXECUTE)

https://community.cloudera.com/t5/Support-Questions/Ranger-policies-on-HDFS-READ-WRITE-EXECUTE/m-p/214462

The native HDFS permission model requires the user to hold x on every ancestor directory in order to descend into a sub-directory. After the Ranger HDFS plugin is enabled, however, Ranger takes over this check; as the Ranger code shows, holding permission on a given path is enough to access it directly.

created at 2023-08-04