Categories
SysOps

How to log HDFS operations

Perform HDFS audit logging.

Read log4j.properties on namenode to inspect default HDFS audit logging options.

$ cat /opt/hadoop/hadoop-3.2.2/etc/hadoop/log4j.properties
[...]
#
# hdfs audit logging
#
hdfs.audit.logger=INFO,NullAppender
hdfs.audit.log.maxfilesize=256MB
hdfs.audit.log.maxbackupindex=20
log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=${hdfs.audit.logger}
log4j.additivity.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=false
log4j.appender.RFAAUDIT=org.apache.log4j.RollingFileAppender
log4j.appender.RFAAUDIT.File=${hadoop.log.dir}/hdfs-audit.log
log4j.appender.RFAAUDIT.layout=org.apache.log4j.PatternLayout
log4j.appender.RFAAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n
log4j.appender.RFAAUDIT.MaxFileSize=${hdfs.audit.log.maxfilesize}
log4j.appender.RFAAUDIT.MaxBackupIndex=${hdfs.audit.log.maxbackupindex}
[...]

Edit hadoop-env.sh on namenode to log HDFS operations.

$ cat /opt/hadoop/hadoop-3.2.2/etc/hadoop/hadoop-env.sh
[...]
###
# NameNode specific parameters
###

# Default log level and output location for file system related change
# messages. For non-namenode daemons, the Java property must be set in
# the appropriate _OPTS if one wants something other than INFO,NullAppender
# Java property: hdfs.audit.logger
# export HDFS_AUDIT_LOGGER=INFO,NullAppender

export HDFS_AUDIT_LOGGER=INFO,RFAAUDIT

[...]

Restart namenode service.

$ sudo systemctl restart hadoop-namenode.service

Monitor HDFS operations using hdfs-audit.log file.

$ tail -f /opt/hadoop/hadoop-3.2.2/logs/hdfs-audit.log

hdfs dfs -ls / will create the following log entries:

2021-06-10 21:29:18,746 INFO FSNamesystem.audit: allowed=true   ugi=hadoop (auth:SIMPLE)        ip=/192.168.8.176       cmd=getfileinfo src=/   dst=null        perm=null       proto=rpc
2021-06-10 21:29:18,784 INFO FSNamesystem.audit: allowed=true   ugi=hadoop (auth:SIMPLE)        ip=/192.168.8.176       cmd=listStatus  src=/   dst=null        perm=null       proto=rpc

hdfs dfs -mkdir /tmp will create the following log entries:

2021-06-10 21:29:54,088 INFO FSNamesystem.audit: allowed=true   ugi=hadoop (auth:SIMPLE)        ip=/192.168.8.176       cmd=getfileinfo src=/tmp        dst=null        perm=null       proto=rpc
2021-06-10 21:29:54,107 INFO FSNamesystem.audit: allowed=true   ugi=hadoop (auth:SIMPLE)        ip=/192.168.8.176       cmd=getfileinfo src=/   dst=null        perm=null       proto=rpc
2021-06-10 21:29:54,128 INFO FSNamesystem.audit: allowed=true   ugi=hadoop (auth:SIMPLE)        ip=/192.168.8.176       cmd=mkdirs      src=/tmp        dst=null        perm=hadoop:supergroup:rwxr-xr-x        proto=rpc

hdfs dfs -stat /tmp will create the following log entry:

2021-06-10 21:30:28,618 INFO FSNamesystem.audit: allowed=true   ugi=hadoop (auth:SIMPLE)        ip=/192.168.8.176       cmd=getfileinfo src=/tmp        dst=null        perm=null       proto=rpc

hdfs dfs -rmdir /tmp will create the following log entries:

2021-06-10 21:31:02,527 INFO FSNamesystem.audit: allowed=true   ugi=hadoop (auth:SIMPLE)        ip=/192.168.8.176       cmd=getfileinfo src=/tmp        dst=null        perm=null       proto=rpc
2021-06-10 21:31:02,558 INFO FSNamesystem.audit: allowed=true   ugi=hadoop (auth:SIMPLE)        ip=/192.168.8.176       cmd=listStatus  src=/tmp        dst=null        perm=null       proto=rpc
2021-06-10 21:31:02,563 INFO FSNamesystem.audit: allowed=true   ugi=hadoop (auth:SIMPLE)        ip=/192.168.8.176       cmd=delete      src=/tmp        dst=null        perm=null       proto=rpc

Sample operations over FUSE, so you can see the behavior of shell helpers.

2021-06-10 21:37:16,366 INFO FSNamesystem.audit: allowed=true   ugi=milosz (auth:SIMPLE)        ip=/192.168.8.11        cmd=getfileinfo src=/.git       dst=null        perm=null       proto=rpc
2021-06-10 21:37:16,370 INFO FSNamesystem.audit: allowed=true   ugi=milosz (auth:SIMPLE)        ip=/192.168.8.11        cmd=getfileinfo src=/   dst=null        perm=null       proto=rpc
2021-06-10 21:37:16,370 INFO FSNamesystem.audit: allowed=true   ugi=milosz (auth:SIMPLE)        ip=/192.168.8.11        cmd=getfileinfo src=/   dst=null        perm=null       proto=rpc
2021-06-10 21:37:16,373 INFO FSNamesystem.audit: allowed=true   ugi=milosz (auth:SIMPLE)        ip=/192.168.8.11        cmd=listStatus  src=/   dst=null        perm=null       proto=rpc
2021-06-10 21:37:18,455 INFO FSNamesystem.audit: allowed=true   ugi=milosz (auth:SIMPLE)        ip=/192.168.8.11        cmd=getfileinfo src=/c  dst=null        perm=null       proto=rpc
2021-06-10 21:37:18,757 INFO FSNamesystem.audit: allowed=true   ugi=milosz (auth:SIMPLE)        ip=/192.168.8.11        cmd=getfileinfo src=/m  dst=null        perm=null       proto=rpc
2021-06-10 21:37:18,758 INFO FSNamesystem.audit: allowed=true   ugi=milosz (auth:SIMPLE)        ip=/192.168.8.11        cmd=getfileinfo src=/m  dst=null        perm=null       proto=rpc
2021-06-10 21:37:18,839 INFO FSNamesystem.audit: allowed=true   ugi=milosz (auth:SIMPLE)        ip=/192.168.8.11        cmd=getfileinfo src=/mi dst=null        perm=null       proto=rpc
2021-06-10 21:37:18,840 INFO FSNamesystem.audit: allowed=true   ugi=milosz (auth:SIMPLE)        ip=/192.168.8.11        cmd=getfileinfo src=/mi dst=null        perm=null       proto=rpc
2021-06-10 21:37:18,983 INFO FSNamesystem.audit: allowed=true   ugi=milosz (auth:SIMPLE)        ip=/192.168.8.11        cmd=listStatus  src=/   dst=null        perm=null       proto=rpc
2021-06-10 21:37:19,421 INFO FSNamesystem.audit: allowed=true   ugi=milosz (auth:SIMPLE)        ip=/192.168.8.11        cmd=getfileinfo src=/   dst=null        perm=null       proto=rpc
2021-06-10 21:37:19,424 INFO FSNamesystem.audit: allowed=true   ugi=milosz (auth:SIMPLE)        ip=/192.168.8.11        cmd=getfileinfo src=/   dst=null        perm=null       proto=rpc
2021-06-10 21:37:19,427 INFO FSNamesystem.audit: allowed=true   ugi=milosz (auth:SIMPLE)        ip=/192.168.8.11        cmd=listStatus  src=/   dst=null        perm=null       proto=rpc
2021-06-10 21:37:19,468 INFO FSNamesystem.audit: allowed=true   ugi=milosz (auth:SIMPLE)        ip=/192.168.8.11        cmd=getfileinfo src=/milosz/.git        dst=null        perm=null       proto=rpc
2021-06-10 21:37:19,469 INFO FSNamesystem.audit: allowed=true   ugi=milosz (auth:SIMPLE)        ip=/192.168.8.11        cmd=getfileinfo src=/.git       dst=null        perm=null       proto=rpc
2021-06-10 21:37:20,223 INFO FSNamesystem.audit: allowed=true   ugi=milosz (auth:SIMPLE)        ip=/192.168.8.11        cmd=getfileinfo src=/milosz/c   dst=null        perm=null       proto=rpc
2021-06-10 21:37:20,304 INFO FSNamesystem.audit: allowed=true   ugi=milosz (auth:SIMPLE)        ip=/192.168.8.11        cmd=getfileinfo src=/milosz/ca  dst=null        perm=null       proto=rpc
2021-06-10 21:37:20,827 INFO FSNamesystem.audit: allowed=true   ugi=milosz (auth:SIMPLE)        ip=/192.168.8.11        cmd=getfileinfo src=/milosz/r   dst=null        perm=null       proto=rpc
2021-06-10 21:37:20,828 INFO FSNamesystem.audit: allowed=true   ugi=milosz (auth:SIMPLE)        ip=/192.168.8.11        cmd=getfileinfo src=/milosz/r   dst=null        perm=null       proto=rpc
2021-06-10 21:37:20,897 INFO FSNamesystem.audit: allowed=true   ugi=milosz (auth:SIMPLE)        ip=/192.168.8.11        cmd=getfileinfo src=/milosz/re  dst=null        perm=null       proto=rpc
2021-06-10 21:37:20,899 INFO FSNamesystem.audit: allowed=true   ugi=milosz (auth:SIMPLE)        ip=/192.168.8.11        cmd=getfileinfo src=/milosz/re  dst=null        perm=null       proto=rpc
2021-06-10 21:37:21,025 INFO FSNamesystem.audit: allowed=true   ugi=milosz (auth:SIMPLE)        ip=/192.168.8.11        cmd=listStatus  src=/milosz     dst=null        perm=null       proto=rpc
2021-06-10 21:37:21,029 INFO FSNamesystem.audit: allowed=true   ugi=milosz (auth:SIMPLE)        ip=/192.168.8.11        cmd=getfileinfo src=/milosz/readme.txt  dst=null        perm=null       proto=rpc
2021-06-10 21:37:21,030 INFO FSNamesystem.audit: allowed=true   ugi=milosz (auth:SIMPLE)        ip=/192.168.8.11        cmd=getfileinfo src=/milosz/readme.txt  dst=null        perm=null       proto=rpc
2021-06-10 21:37:21,492 INFO FSNamesystem.audit: allowed=true   ugi=milosz (auth:SIMPLE)        ip=/192.168.8.11        cmd=open        src=/milosz/readme.txt  dst=null        perm=null       proto=rpc
2021-06-10 21:37:21,537 INFO FSNamesystem.audit: allowed=true   ugi=milosz (auth:SIMPLE)        ip=/192.168.8.11        cmd=getfileinfo src=/milosz     dst=null        perm=null       proto=rpc
2021-06-10 21:37:21,539 INFO FSNamesystem.audit: allowed=true   ugi=milosz (auth:SIMPLE)        ip=/192.168.8.11        cmd=getfileinfo src=/milosz     dst=null        perm=null       proto=rpc
2021-06-10 21:37:21,540 INFO FSNamesystem.audit: allowed=true   ugi=milosz (auth:SIMPLE)        ip=/192.168.8.11        cmd=listStatus  src=/milosz     dst=null        perm=null       proto=rpc
2021-06-10 21:37:21,541 INFO FSNamesystem.audit: allowed=true   ugi=milosz (auth:SIMPLE)        ip=/192.168.8.11        cmd=getfileinfo src=/milosz/.git        dst=null        perm=null       proto=rpc
2021-06-10 21:37:21,542 INFO FSNamesystem.audit: allowed=true   ugi=milosz (auth:SIMPLE)        ip=/192.168.8.11        cmd=getfileinfo src=/.git       dst=null        perm=null       proto=rpc