openstack之nova-scheduler详解

int32位 posted @ Oct 10, 2014 10:07:36 AM in openstack , 4856 阅读
转载请注明:http://krystism.is-programmer.com/若有错误,请多多指正,谢谢!

nova-scheduler的功能是负责从多宿主机中调度最适合的宿主机生成云主机。即传入需要启动的云主机列表,nova-scheduler根据云主机的数量、参数等进行调度,选择合适的物理机(hypervisor,宿主机,即运行nova-compute的节点)启动这些云主机。在H版本中实现的调度算法有两个,即过滤(filter)调度算法和随机调度算法(chance)。目前的默认调度算法是 filter-scheduler,即过滤调度器,其思想是先进行一些条件过滤一些宿主机,比如要求可用内存大于2GB,小于2GB的直接过滤,过滤器可以串联多个,即层层过滤。然后对过滤后的宿主机进行权值计算,权值计算就是根据宿主机的状态进行评分(weight),最后根据评分(weight)排序,评分最高的为最佳候选宿主机,评分也是可以串联的,即层层评分。注意openstack的设计原则是可扩展,意味着调度算法、过滤函数、评分函数都是可插除的,用户可以自定义自己的调度器,过滤器,评分方法,而只需在配置文件中配置即可,无需修改核心代码。实现的过滤器很多,而评分函数目前只有内存评分,即根据内存使用量进行评分。

注意:启动云主机时宿主机的位置并不是完全由scheduler控制,用户可以指定availability-zone,aggregate以及通过设置--hint来控制宿主机在某个集合中(本质还是过滤,即通过用户设定的条件进行过滤)。

下面从入口manager开始看看nova-scheduler如何工作的,部分代码:

 def run_instance(self, context, request_spec, admin_password,
            injected_files, requested_networks, is_first_time,
            filter_properties, legacy_bdm_in_spec=True):
        """Tries to call schedule_run_instance on the driver.
        Sets instance vm_state to ERROR on exceptions
        """
        instance_uuids = request_spec['instance_uuids']
        with compute_utils.EventReporter(context, conductor_api.LocalAPI(),
                                         'schedule', *instance_uuids):
            try:
                return self.driver.schedule_run_instance(context,
                        request_spec, admin_password, injected_files,
                        requested_networks, is_first_time, filter_properties,
                        legacy_bdm_in_spec)

            except exception.NoValidHost as ex:
                # don't re-raise
                self._set_vm_state_and_notify('run_instance',
                                              {'vm_state': vm_states.ERROR,
                                              'task_state': None},
                                              context, ex, request_spec)
            except Exception as ex:
                with excutils.save_and_reraise_exception():
                    self._set_vm_state_and_notify('run_instance',
                                                  {'vm_state': vm_states.ERROR,
                                                  'task_state': None},
                                                  context, ex, request_spec)

方法先获取需要创建云主机的uuid,然后直接调用driver的schedule_run_instance,这个driver即调度器,所有的调度器必须继承自driver.Schduler, 并且实现三个抽象方法:schedule_run_instance,select_destinations,select_hosts。driver是由配置文件配置,默认为nova.scheduler.filter_scheduler.FilterScheduler,如下:

scheduler_driver_opt = cfg.StrOpt('scheduler_driver',
        default='nova.scheduler.filter_scheduler.FilterScheduler',
        help='Default driver to use for the scheduler')

CONF = cfg.CONF
CONF.register_opt(scheduler_driver_opt)
class SchedulerManager(manager.Manager):
    """Chooses a host to run instances on."""

    def __init__(self, scheduler_driver=None, *args, **kwargs):
        if not scheduler_driver:
            scheduler_driver = CONF.scheduler_driver
        self.driver = importutils.import_object(scheduler_driver)
        self.compute_rpcapi = compute_rpcapi.ComputeAPI()
        super(SchedulerManager, self).__init__(service_name='scheduler',
                                               *args, **kwargs)
   ### 省略其他代码

由此可见入口函数直接调用driver(调度器)的scheduler_run_instance方法,为了简单起见,下面以chance调度器为例,看看它如何工作。首先查看chance调度器的scheduler_run_instance代码:

def schedule_run_instance(self, context, request_spec,
                              admin_password, injected_files,
                              requested_networks, is_first_time,
                              filter_properties, legacy_bdm_in_spec):
        """Create and run an instance or instances."""
        instance_uuids = request_spec.get('instance_uuids')
        for num, instance_uuid in enumerate(instance_uuids):
            request_spec['instance_properties']['launch_index'] = num
            try:
                host = self._schedule(context, CONF.compute_topic,
                                      request_spec, filter_properties)
                updated_instance = driver.instance_update_db(context,
                        instance_uuid)
                self.compute_rpcapi.run_instance(context,
                        instance=updated_instance, host=host,
                        requested_networks=requested_networks,
                        injected_files=injected_files,
                        admin_password=admin_password,
                        is_first_time=is_first_time,
                        request_spec=request_spec,
                        filter_properties=filter_properties,
                        legacy_bdm_in_spec=legacy_bdm_in_spec)
            except Exception as ex:
                # NOTE(vish): we don't reraise the exception here to make sure
                #             that all instances in the request get set to
                #             error properly
                driver.handle_schedule_error(context, ex, instance_uuid,
                                             request_spec)

该方法首先获取所有需要启动的云主机列表uuid,对每个待启动云主机调用_schedule方法,该方法返回一个宿主机,更新数据库,然后调用nova-compute的远程方法api(rpcapi)调用run_instance方法,在选择的宿主机中启动该云主机,nova-scheduler任务完成。下面看看_schedule实现:

def _schedule(self, context, topic, request_spec, filter_properties):
        """Picks a host that is up at random."""

        elevated = context.elevated()
        hosts = self.hosts_up(elevated, topic) # 父类Schduler方法,返回所有nova-compute状态为up的主机列表。
        if not hosts:
            msg = _("Is the appropriate service running?")
            raise exception.NoValidHost(reason=msg)

        hosts = self._filter_hosts(request_spec, hosts, filter_properties) # 过滤一些主机黑名单列表。
        if not hosts:
            msg = _("Could not find another compute")
            raise exception.NoValidHost(reason=msg)

        return random.choice(hosts) # 随机返回其中一个主机

该方法非常简单,首先获取所有服务状态为up的宿主机列表,然后过滤一些黑名单,最后调用random.choice方法随机从中返回一个宿主机。

文中只对简单chance算法进行了简单叙述,而filter算法由于比较复杂,后面以另一篇文章进行叙述。

 

转载请注明:http://krystism.is-programmer.com/若有错误,请多多指正,谢谢!
  • 无匹配
  • 无匹配

登录 *


loading captcha image...
(输入验证码)
or Ctrl+Enter