最近由于项目需要在研究spark相关的内容,形成了一些技术性文档,发布这记录下,懒得翻译了。
There are some spaces the official documents didn‘t explain very clearly, especially on some details. Here are given some more explanations base on the practices I did and the source codes I read these days.
(The official document link is http://spark.apache.org/docs/latest/job-scheduling.html)
The FAIR Scheduler is the way corresponding to Hadoop FAIR scheduler and enhancement for FIFO. In FIFO fashion, there is only one factor Priority will be considered in SchedulableQueue; While in FAIR fashion, more factors will be considered including minshare, runningtasks, weight (You can reference the code below if interest).Similarly, the jobs don‘t always run by following the rules by FairSchedulingAlgorithm strictly, while as a whole, the FAIR scheduler really alleviate largely the delay time for small jobs by adjusting the parameters which were delayed significantly in FIFO fashion in my observation through the concurrent JMeter tests。
private[spark] class FIFOSchedulingAlgorithm extends SchedulingAlgorithm { override def comparator(s1: Schedulable, s2: Schedulable): Boolean = { val priority1 = s1.priority val priority2 = s2.priority var res = math.signum(priority1 - priority2) if (res == 0) { val stageId1 = s1.stageId val stageId2 = s2.stageId res = math.signum(stageId1 - stageId2) } if (res < 0) { true } else { false } } }
private[spark] class FairSchedulingAlgorithm extends SchedulingAlgorithm { override def comparator(s1: Schedulable, s2: Schedulable): Boolean = { val minShare1 = s1.minShare val minShare2 = s2.minShare val runningTasks1 = s1.runningTasks val runningTasks2 = s2.runningTasks val s1Needy = runningTasks1 < minShare1 val s2Needy = runningTasks2 < minShare2 val minShareRatio1 = runningTasks1.toDouble / math.max(minShare1, 1.0).toDouble val minShareRatio2 = runningTasks2.toDouble / math.max(minShare2, 1.0).toDouble val taskToWeightRatio1 = runningTasks1.toDouble / s1.weight.toDouble val taskToWeightRatio2 = runningTasks2.toDouble / s2.weight.toDouble var compare: Int = 0 if (s1Needy && !s2Needy) { return true } else if (!s1Needy && s2Needy) { return false } else if (s1Needy && s2Needy) { compare = minShareRatio1.compareTo(minShareRatio2) } else { compare = taskToWeightRatio1.compareTo(taskToWeightRatio2) } if (compare < 0) { true } else if (compare > 0) { false } else { s1.name < s2.name } }
5.The pools in FIFO and FAIR schedulers
原文:http://www.cnblogs.com/taxuexunmei/p/4991250.html