While running Hadoop 2.2.0, I noticed that the edits_* files under the ${dfs.namenode.checkpoint.edits.dir} directory kept accumulating, so I compiled the related configuration options here.
Hadoop 2.2.0 provides the following configuration options related to fsimage and edit logs:
<property>
<name>dfs.namenode.checkpoint.dir</name>
<value>file://${hadoop.tmp.dir}/dfs/namesecondary</value>
<description>Determines where on the local filesystem the DFS secondary
name node should store the temporary images to merge.
If this is a comma-delimited list of directories then the image is
replicated in all of the directories for redundancy.
</description>
</property>
<property>
<name>dfs.namenode.checkpoint.edits.dir</name>
<value>${dfs.namenode.checkpoint.dir}</value>
<description>Determines where on the local filesystem the DFS secondary
name node should store the temporary edits to merge.
If this is a comma-delimited list of directories then the edits are
replicated in all of the directories for redundancy.
The default value is the same as dfs.namenode.checkpoint.dir.
</description>
</property>
<property>
<name>dfs.namenode.checkpoint.period</name>
<value>3600</value>
<description>The number of seconds between two periodic checkpoints.
</description>
</property>
<property>
<name>dfs.namenode.checkpoint.txns</name>
<value>1000000</value>
<description>The Secondary NameNode or CheckpointNode will create a checkpoint
of the namespace every 'dfs.namenode.checkpoint.txns' transactions, regardless
of whether 'dfs.namenode.checkpoint.period' has expired.
</description>
</property>
<property>
<name>dfs.namenode.checkpoint.check.period</name>
<value>60</value>
<description>The SecondaryNameNode and CheckpointNode will poll the NameNode
every 'dfs.namenode.checkpoint.check.period' seconds to query the number
of uncheckpointed transactions.
</description>
</property>
<property>
<name>dfs.namenode.checkpoint.max-retries</name>
<value>3</value>
<description>The SecondaryNameNode retries failed checkpointing. If the
failure occurs while loading fsimage or replaying edits, the number of
retries is limited by this variable.
</description>
</property>
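To illustrate how the period and transaction-count triggers interact, here is a sketch of an hdfs-site.xml override that makes the Secondary NameNode checkpoint every 30 minutes or after 500,000 transactions, whichever comes first. The values are illustrative assumptions, not recommendations; tune them to your namespace size and write rate.

```xml
<!-- hdfs-site.xml: illustrative checkpoint-trigger overrides -->
<property>
<name>dfs.namenode.checkpoint.period</name>
<!-- checkpoint at least every 1800 seconds (30 minutes) -->
<value>1800</value>
</property>
<property>
<name>dfs.namenode.checkpoint.txns</name>
<!-- or as soon as 500,000 uncheckpointed transactions accumulate -->
<value>500000</value>
</property>
```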
There are also the following two options (the ones that were puzzling me), which control how many fsimage checkpoints and edit log transactions are retained. By default, 2 fsimage checkpoints and 1,000,000 extra edit transactions are kept.
<property>
<name>dfs.namenode.num.checkpoints.retained</name>
<value>2</value>
<description>The number of image checkpoint files that will be retained by
the NameNode and Secondary NameNode in their storage directories. All edit
logs necessary to recover an up-to-date namespace from the oldest retained
checkpoint will also be retained.
</description>
</property>
<property>
<name>dfs.namenode.num.extra.edits.retained</name>
<value>1000000</value>
<description>The number of extra transactions which should be retained
beyond what is minimally necessary for a NN restart. This can be useful for
audit purposes or for an HA setup where a remote Standby Node may have been
offline for some time and need to have a longer backlog of retained edits
in order to start again. Typically each edit is on the order of a few
hundred bytes, so the default of 1 million edits should be on the order of
hundreds of MBs or low GBs. NOTE: Fewer extra edits may be retained than
value specified for this setting if doing so would mean that more segments
would be retained than the number configured by
dfs.namenode.max.extra.edits.segments.retained.
</description>
</property>
For the next option, see <<hadoop 2.2.0 fsimage和edit logs的处理逻辑>>.
<property>
<name>dfs.namenode.max.extra.edits.segments.retained</name>
<value>10000</value>
<description>The maximum number of extra edit log segments which should be
retained beyond what is minimally necessary for a NN restart. When used in
conjunction with dfs.namenode.num.extra.edits.retained, this configuration
property serves to cap the number of extra edits files to a reasonable
value.
</description>
</property>
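Tying this back to the original problem: if old edits_* files keep piling up, lowering the two retention options above lets them be purged sooner. The values below are an illustrative sketch, assuming no Standby NameNode or audit requirement needs a long backlog of edits; do not shrink them if a Standby may fall behind.

```xml
<!-- hdfs-site.xml: illustrative retention overrides to purge edits sooner -->
<property>
<name>dfs.namenode.num.extra.edits.retained</name>
<!-- keep only 100,000 extra transactions beyond what a restart needs -->
<value>100000</value>
</property>
<property>
<name>dfs.namenode.max.extra.edits.segments.retained</name>
<!-- cap the number of extra edit log segments kept on disk -->
<value>100</value>
</property>
```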
hadoop 2.2.0 关于 fsimage & edit log 的相关配置
Original: http://blog.csdn.net/knowledgeaaa/article/details/23842099