Edit /etc/libvirt/libvirtd.conf:
listen_tls = 0
listen_tcp = 1
auth_tcp = "none"
tcp_port = "16509"
Even with listen_tcp set, the listening service does not start. Restart the libvirt-bin service and verify:
sudo netstat -nlpt
# port 16509 is not listening
ps aux | grep libvirt
# libvirtd is running without the -l flag
To actually enable listening, edit the /etc/init/libvirt-bin.conf file and change the exec line to exec /usr/sbin/libvirtd $libvirtd_opts -l. Note the trailing -l: the option must be appended here and cannot be written into libvirtd_opts directly — that does not take effect, for reasons unknown.
Restart the libvirt-bin service again, then use netstat to confirm the TCP port is open and ps to confirm libvirtd now has the -l option. Once both check out, run:
virsh --connect qemu+tcp://node1/system list
where node1 is the hostname. If no error is reported, the TCP listening service is up.
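As a final check from code, the same TCP URI can be opened with the libvirt Python bindings — a minimal sketch, assuming the libvirt-python package is installed (listAllDomains needs libvirt >= 0.9.13) and that node1 resolves from the client machine:
import libvirt

# open a read-only connection through the freshly enabled TCP listener
conn = libvirt.openReadOnly("qemu+tcp://node1/system")
print("Connected, hypervisor type: %s" % conn.getType())

# listing domains proves the channel works end to end
for dom in conn.listAllDomains():
    print(dom.name(), dom.isActive())
conn.close()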
OpenStack is a cloud-platform management project composed of several major components, aiming to provide a platform for building and managing public and private clouds. Jointly developed by Rackspace and NASA, it helps service providers and enterprises offer Infrastructure as a Service (IaaS) similar to Amazon EC2 and S3; see the OpenStack official site for details.
This project aims to provide an easy-to-call, easy-to-extend OpenStack development kit. It is a Java library wrapped around the OpenStack API, originally written to meet our lab's needs. After roughly three months of on-and-off work, only the basic features are implemented; many bugs remain and a lot of functionality is still missing.
Project address: https://github.com/krystism/openstack-java-sdk
First, build the project with Maven:
mvn package
If you need to bundle the dependencies into a single package, use the following command:
mvn assembly:assembly
The project needs a configuration file at runtime. Its path is given by the OPENSTACK_CONF_PATH system variable and defaults to /etc/openstack; see the configuration-file notes. I studied the design of the official OpenStack Python clients and tried to make calls simple and convenient. Here is a demo:
OpenstackSession session = OpenstackSession.getSession("username", "password");// get session
Nova nova = session.getNovaClient(); // get nova client
// get a flavor list, print their name
for (Flavor flavor : nova.flavors.list()) {
System.out.println(flavor.getName());
}
// create a new server
Server server = new Server();
server.setName("demo");
server.setImageRef("imageId");
server.setFlavorRef("flavorId");
// some other config
nova.servers.create(server); // call create method to execute.
Time is limited and my coding skills are modest, so finishing every feature perfectly is beyond me. I hope more capable developers will join in, point out the shortcomings of the current work, or even refactor the whole codebase. My sincere thanks to every kind contributor!
Here is how I currently extend the SDK with new features:
A model is the Java bean for an entity. The incoming data is generally JSON, so a JSON-encoded object has to be converted into a Java object. You are free to define your beans entirely by yourself; all of mine extend AbstractEntity, annotate properties with @Property, and take a JSONObject in the constructor — usually just delegating to the base-class constructor completes the JSONObject-to-bean conversion. A model demo:
@Entity("volume_type")
public class VolumeType extends AbstractEntity {
private static final long serialVersionUID = -6539238104579991330L;
@Property("extra_specs")
private JSONObject metadata;
@Property("name")
private String name;
@Property("id")
private String id;
public VolumeType() {
super();
}
public VolumeType(JSONObject jsonObj) {
super(jsonObj);
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
// some other getter and setter here
}
A manager interface defines the operations on a bean; these operations may have different implementations for different API versions. I did not use a common interface, nor interface inheritance, when designing them, because the operations differ a lot between entity beans — even identical operations do not take the same parameters. A manager interface demo:
public interface ServerManager {
/**
* Get a server.
* @param id ID of the Server to get.
* @return Server
* @throws OperationException
*/
Server get(String id) throws OperationException;
/**
* Get a list of servers.
* @return List of server
* @throws OperationException
*/
List<Server> list() throws OperationException;
/**
* Stop(power off) the server
* @param id The id of server to stop.
* @throws OperationException
*/
void stop(String id) throws OperationException;
/**
* Start(power on) the server,
* @param id The id of server to start.
* @throws OperationException
*/
void start(String id) throws OperationException;
/**
* Reboot a server, for a software-level reboot.
* @param id The ID of server to reboot.
* @see reboot(String id, boolean hard)
*/
void reboot(String id) throws OperationException;
/**
* update the name for a Server
* @param id the id of server to rename
* @param name the new name of the server.
* @throws OperationException
*/
void rename(String id, String name) throws OperationException;
/**
* Create (boot) a new Server.<br/>
* <i>Remember</i> : You must set name, imageRef, flavorRef !
* @param instance The server to create; name, imageRef and flavorRef must be set.
* @return The created Server.
* @throws OperationException
*/
Server create(Server instance) throws OperationException;
/**
* Delete (i.e. shut down and delete the image) this server.
* @param id The ID of server to delete.
* @throws OperationException
*/
void delete(String id) throws OperationException;
}
Next come the concrete implementations of an interface. They may differ per API version, and in practice the version to use can be chosen from the configuration file. An implementation demo:
public class Flavors extends AbstractManager<Flavor> implements FlavorManager {
private final String PREFIX = "/flavors";
public Flavors(Authenticated credentical) {
super(credentical, Flavor.class);
}
/**
* Get a list of all flavors
* @return A List , which holds flavors
* @throws OperationException
*/
@Override
public List<Flavor> list() throws OperationException {
return _list(PREFIX + "/detail");
}
/**
* Get a specific Flavor.
* @param id The ID of Flavor to get.
* @return Flavor
* @throws OperationException
*/
@Override
public Flavor get(String id) throws OperationException {
return _get(PREFIX + "/" + id);
}
/**
* Delete a specific Flavor.
* @param id The ID of Flavor to delete.
* @throws OperationException
*/
@Override
public void delete(String id) throws OperationException {
_delete("/flavors/" + id);
}
/**
* create a new flavor for a tenant
* @param flavor The flavor to create
* @return The new flavor
* @throws OperationException
*/
@Override
public Flavor create(Flavor flavor) throws OperationException {
return _create(PREFIX, flavor);
}
}
Registering a new feature means attaching its manager to a client (such as Nova or Glance). For now I hard-code this; properly, a factory should create the managers for the API version chosen in the configuration file. Demo:
public class Nova {
public final FlavorManager flavors;
public final HypervisorManager hypervisors;
public final ServerManager servers;
public final KeyPairManager keypairs;
public Nova(Authenticated credentical) {
// bad work, don't do that!!
flavors = new Flavors(credentical);
hypervisors = new Hypervisors(credentical);
servers = new Servers(credentical);
keypairs = new KeyPairs(credentical);
}
}
To add behavior to an already implemented interface — say, logging before each request, or caching — just add a decorator. Demo:
public class FlavorCachedManager implements FlavorManager {
private final java.util.Map<String, Flavor> cache = new java.util.HashMap<String, Flavor>();
private FlavorManager flavors;
public FlavorCachedManager(FlavorManager flavors) {
this.flavors = flavors;
}
@Override
public Flavor get(String id) throws OperationException {
Flavor flavor = cache.get(id); // try the cache first
if (flavor == null) {
flavor = flavors.get(id); // cache miss: delegate to the wrapped manager
cache.put(id, flavor);
}
return flavor;
}
// the other FlavorManager methods simply delegate to this.flavors
}
License: The MIT License (MIT)
When taking a backup snapshot of a virtual machine, the image usually ends up larger than the actual data; you'll need to zero out all free space of the partitions contained within the guest first. See the Proxmox wiki: https://pve.proxmox.com/wiki/Shrin_Qcow2_Disk_Files
For Linux images:
Delete as many useless files as you can, then fill the free space with zeros:
dd if=/dev/zero of=/mytempfile
# this can take some time
rm -f /mytempfile
Then set the original image aside and re-convert it; the second form adds -c to compress the result:
mv image.qcow2 image.qcow2_backup
qemu-img convert -O qcow2 image.qcow2_backup image.qcow2
qemu-img convert -O qcow2 -c image.qcow2_backup image.qcow2
Because the telecom side needs to deploy software remotely, we plan to use Puppet for remote service management.
Without cloud-init installed, the MTU must be set to 1454 by hand, otherwise the instance has no network access. This is only necessary when using Neutron with GRE tunnels.
Add the commonly used hosts to C:/Windows/System32/drivers/etc/hosts — above all the master's entry. Be sure to set this!!
The agent must stay time-synchronized with the master, so point the Windows time-update server at the master.
win32-dir version greater than 0.43 is required. To install it, open "Start Command Prompt with Puppet" as administrator and run:
gem install win32-dir
The timezone fact outputs Chinese by default, which causes encoding errors; set setcode to an English string, preferably a standard world time-zone identifier.
Download address: the official site
Before this, clear the puppet SSL directory (under C:/ProgramData) and adjust puppet.conf to your situation. To keep the disk file small, it is best to run Disk Cleanup and delete any useless files.
Converting the format both merges in the base image and re-organizes the disk file, shrinking it. Run:
qemu-img convert -O qcow2 origin.qcow2 new.qcow2
Upload the image to Glance, and remember to set os_type = windows — otherwise you will run into RTC (hardware clock) problems.
When booting the instance, remember os_type = windows as well, for the same reason!
The previous article, openstack之nova-scheduler, covered nova-scheduler. For simplicity it described only the chance (random) algorithm and skipped the filter scheduler. The filter algorithm is not hard: it first filters out hosts that fail the conditions, layer by layer, then scores and sorts the remaining hosts by weight; the scores from multiple weighers are accumulated, and a host with a relatively high score (not necessarily the highest — the reason is explained below) becomes the candidate. The official documentation describes the algorithm: http://docs.openstack.org/trunk/config-reference/content/section_compute-scheduler.html Below we trace how it runs through the source code.
First, the schedule_run_instance method:
def schedule_run_instance(self, context, request_spec,
                          admin_password, injected_files,
                          requested_networks, is_first_time,
                          filter_properties, legacy_bdm_in_spec):
    """This method is called from nova.compute.api to provision
    an instance. We first create a build plan (a list of WeightedHosts)
    and then provision.

    Returns a list of the instances created.
    """
    payload = dict(request_spec=request_spec)
    self.notifier.info(context, 'scheduler.run_instance.start', payload)

    instance_uuids = request_spec.get('instance_uuids')  # get the uuids; there may be several
    LOG.info(_("Attempting to build %(num_instances)d instance(s) "
               "uuids: %(instance_uuids)s"),
             {'num_instances': len(instance_uuids),
              'instance_uuids': instance_uuids})
    LOG.debug(_("Request Spec: %s") % request_spec)

    # returns the list of weighed hosts
    weighed_hosts = self._schedule(context, request_spec,
                                   filter_properties, instance_uuids)

    # NOTE: Pop instance_uuids as individual creates do not need the
    # set of uuids. Do not pop before here as the upper exception
    # handler for NoValidHost needs the uuid to set error state
    instance_uuids = request_spec.pop('instance_uuids')  # the uuids are no longer needed

    # NOTE(comstud): Make sure we do not pass this through. It
    # contains an instance of RpcContext that cannot be serialized.
    filter_properties.pop('context', None)

    for num, instance_uuid in enumerate(instance_uuids):
        request_spec['instance_properties']['launch_index'] = num

        try:
            try:
                weighed_host = weighed_hosts.pop(0)  # pop the first host
                LOG.info(_("Choosing host %(weighed_host)s "
                           "for instance %(instance_uuid)s"),
                         {'weighed_host': weighed_host,
                          'instance_uuid': instance_uuid})
            except IndexError:
                raise exception.NoValidHost(reason="")

            self._provision_resource(context, weighed_host,
                                     request_spec,
                                     filter_properties,
                                     requested_networks,
                                     injected_files, admin_password,
                                     is_first_time,
                                     instance_uuid=instance_uuid,
                                     legacy_bdm_in_spec=legacy_bdm_in_spec)
        except Exception as ex:
            # NOTE(vish): we don't reraise the exception here to make sure
            #             that all instances in the request get set to
            #             error properly
            driver.handle_schedule_error(context, ex, instance_uuid,
                                         request_spec)
        # scrub retry host list in case we're scheduling multiple
        # instances:
        retry = filter_properties.get('retry', {})
        retry['hosts'] = []

    self.notifier.info(context, 'scheduler.run_instance.end', payload)
After some argument handling, this method first calls _schedule, which returns a list of hosts; for each instance to boot it then calls _provision_resource, passing in the chosen host. _provision_resource updates the database and calls nova-compute's rpcapi to boot the instance on that target host. The core is the _schedule method:
def _schedule(self, context, request_spec, filter_properties,
              instance_uuids=None):
    """Returns a list of hosts that meet the required specs,
    ordered by their fitness.
    """
    elevated = context.elevated()
    instance_properties = request_spec['instance_properties']
    instance_type = request_spec.get("instance_type", None)  # get flavor

    # Get the group
    update_group_hosts = False
    scheduler_hints = filter_properties.get('scheduler_hints') or {}
    # --hint group SERVER_GROUP: if a group is given, record it
    group = scheduler_hints.get('group', None)
    if group:
        group_hosts = self.group_hosts(elevated, group)
        update_group_hosts = True
        if 'group_hosts' not in filter_properties:
            filter_properties.update({'group_hosts': []})
        configured_hosts = filter_properties['group_hosts']
        filter_properties['group_hosts'] = configured_hosts + group_hosts

    config_options = self._get_configuration_options()

    # check retry policy. Rather ugly use of instance_uuids[0]...
    # but if we've exceeded max retries... then we really only
    # have a single instance.
    properties = instance_properties.copy()
    if instance_uuids:
        properties['uuid'] = instance_uuids[0]
    # raises NoValidHost if the maximum number of retries is exceeded
    self._populate_retry(filter_properties, properties)

    filter_properties.update({'context': context,
                              'request_spec': request_spec,
                              'config_options': config_options,
                              'instance_type': instance_type})

    # fill filter_properties with extra data such as project_id, os_type, ...
    self.populate_filter_properties(request_spec,
                                    filter_properties)

    # Find our local list of acceptable hosts by repeatedly
    # filtering and weighing our options. Each time we choose a
    # host, we virtually consume resources on it so subsequent
    # selections can adjust accordingly.

    # Note: remember, we are using an iterator here. So only
    # traverse this list once. This can bite you if the hosts
    # are being scanned in a filter or weighing function.

    # get the list of all hosts; host_manager is set up in the parent
    # class's init according to CONF, defaulting to
    # nova.scheduler.host_manager.HostManager, which reads the database directly
    hosts = self.host_manager.get_all_host_states(elevated)

    selected_hosts = []
    if instance_uuids:
        num_instances = len(instance_uuids)
    else:
        num_instances = request_spec.get('num_instances', 1)
    # note the difference between range and xrange: range returns a list,
    # xrange returns a generator
    for num in xrange(num_instances):
        # Filter local hosts based on requirements ...
        hosts = self.host_manager.get_filtered_hosts(hosts,
                filter_properties, index=num)
        if not hosts:
            # Can't get any more locally.
            break
        LOG.debug(_("Filtered %(hosts)s"), {'hosts': hosts})
        # compute the weights and sort from high to low
        weighed_hosts = self.host_manager.get_weighed_hosts(hosts,
                filter_properties)
        LOG.debug(_("Weighed %(hosts)s"), {'hosts': weighed_hosts})

        # truncate the list to the configured size
        scheduler_host_subset_size = CONF.scheduler_host_subset_size
        if scheduler_host_subset_size > len(weighed_hosts):
            scheduler_host_subset_size = len(weighed_hosts)
        if scheduler_host_subset_size < 1:
            scheduler_host_subset_size = 1

        # choose one host at random from the truncated list -- not
        # necessarily the highest-scoring one
        chosen_host = random.choice(
            weighed_hosts[0:scheduler_host_subset_size])
        selected_hosts.append(chosen_host)

        # Now consume the resources so the filter/weights
        # will change for the next instance.
        # update the values to prepare for scheduling the next instance
        chosen_host.obj.consume_from_instance(instance_properties)
        if update_group_hosts is True:
            filter_properties['group_hosts'].append(chosen_host.obj.host)
    return selected_hosts
Its two core calls are host_manager.get_filtered_hosts and host_manager.get_weighed_hosts, the filtering and weighing phases of the algorithm. Note that weighing returns a sorted host list, but the target host is not simply the top entry: one host is picked at random from the top N, with N set by the scheduler_host_subset_size option. For example, with scheduler_host_subset_size = 5 and 10 hosts surviving the filters, one of the top 5 is returned at random — which is why earlier I said the score is "relatively high" rather than "highest". host_manager is configurable and defaults to nova.scheduler.host_manager.HostManager.
HostManager's get_filtered_hosts mainly calls two methods: _choose_host_filters and filter_handler.get_filtered_objects. The former turns filter class names into the class objects themselves (like resolving a class by name in Java — from "Apple" to a.b.Apple.class, i.e. getClass("Apple")). The filter names are configured in nova.conf via scheduler_default_filters and default to RetryFilter, AvailabilityZoneFilter, RamFilter, ComputeFilter, ComputeCapabilitiesFilter, ImagePropertiesFilter. The class list is then handed to filter_handler.get_filtered_objects; filter_handler is filters.HostFilterHandler, which inherits from nova.filters.BaseFilterHandler, implemented as:
class BaseFilterHandler(loadables.BaseLoader):
    """Base class to handle loading filter classes.

    This class should be subclassed where one needs to use filters.
    """

    def get_filtered_objects(self, filter_classes, objs,
                             filter_properties, index=0):
        list_objs = list(objs)
        LOG.debug(_("Starting with %d host(s)"), len(list_objs))
        for filter_cls in filter_classes:
            cls_name = filter_cls.__name__
            filter = filter_cls()

            if filter.run_filter_for_index(index):
                objs = filter.filter_all(list_objs, filter_properties)
                if objs is None:
                    LOG.debug(_("Filter %(cls_name)s says to stop filtering"),
                              {'cls_name': cls_name})
                    return
                list_objs = list(objs)
                LOG.debug(_("Filter %(cls_name)s returned "
                            "%(obj_len)d host(s)"),
                          {'cls_name': cls_name, 'obj_len': len(list_objs)})
                if len(list_objs) == 0:
                    break
        return list_objs
As you can see, it iterates over all the filter classes, instantiates each, calls its filter_all method, and finally returns every object that was not filtered out. Now the filter classes themselves:
As the previous article said, filters are pluggable: to define your own, simply subclass BaseHostFilter (defined in nova.scheduler.filters.__init__.py) and implement host_passes, as in the following code:
class BaseHostFilter(filters.BaseFilter):
    """Base class for host filters."""

    def _filter_one(self, obj, filter_properties):
        """Return True if the object passes the filter, otherwise False."""
        return self.host_passes(obj, filter_properties)

    def host_passes(self, host_state, filter_properties):
        """Return True if the HostState passes the filter, otherwise False.
        Override this in a subclass.
        """
        raise NotImplementedError()
So BaseHostFilter inherits from filters.BaseFilter:
class BaseFilter(object):
    """Base class for all filter classes."""

    def _filter_one(self, obj, filter_properties):
        """Return True if it passes the filter, False otherwise.
        Override this in a subclass.
        """
        return True

    def filter_all(self, filter_obj_list, filter_properties):
        """Yield objects that pass the filter.

        Can be overriden in a subclass, if you need to base filtering
        decisions on all objects. Otherwise, one can just override
        _filter_one() to filter a single object.
        """
        for obj in filter_obj_list:
            if self._filter_one(obj, filter_properties):
                yield obj

    # Set to true in a subclass if a filter only needs to be run once
    # for each request rather than for each instance
    run_filter_once_per_request = False

    def run_filter_for_index(self, index):
        """Return True if the filter needs to be run for the "index-th"
        instance in a request. Only need to override this if a filter
        needs anything other than "first only" or "all" behaviour.
        """
        if self.run_filter_once_per_request and index > 0:
            return False
        else:
            return True
Only two methods matter here: _filter_one and filter_all. _filter_one takes the object being filtered plus the filter properties and returns a bool — True to pass, False to reject; filter_all takes a collection and, by calling _filter_one on each element, yields a generator of the elements that pass. So normally we only override _filter_one — and since BaseHostFilter's _filter_one delegates to host_passes, overriding host_passes is enough.
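As an illustration, a custom filter might look like the following sketch — hypothetical, not part of nova, modeled on the built-in RamFilter (free_ram_mb is a field that HostState maintains):
from nova.scheduler import filters


class MinFreeRamFilter(filters.BaseHostFilter):
    """Pass only hosts with at least 2 GB of free RAM (example threshold)."""

    def host_passes(self, host_state, filter_properties):
        # HostState keeps free_ram_mb up to date from the database
        return host_state.free_ram_mb >= 2048
It would then be enabled like any other filter, through the scheduler filter options in nova.conf (scheduler_available_filters / scheduler_default_filters).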
filter_handler is exactly what calls each filter class's filter_all method.
That concludes the filtering phase; now for weighing. Back in _schedule, host_manager.get_weighed_hosts is called, which delegates to weight_handler.get_weighed_objects. weight_handler is a HostWeightHandler instance, whose class inherits from nova.weights.BaseWeightHandler, implemented as:
class BaseWeightHandler(loadables.BaseLoader):
    object_class = WeighedObject

    def get_weighed_objects(self, weigher_classes, obj_list,
                            weighing_properties):
        """Return a sorted (highest score first) list of WeighedObjects."""

        if not obj_list:
            return []

        weighed_objs = [self.object_class(obj, 0.0) for obj in obj_list]
        for weigher_cls in weigher_classes:
            weigher = weigher_cls()
            weigher.weigh_objects(weighed_objs, weighing_properties)

        return sorted(weighed_objs, key=lambda x: x.weight, reverse=True)
Much like filtering, this iterates over all the weigher classes, calling each one's weigh_objects method to obtain a weight that is accumulated onto the previous value. weigh_objects in turn relies on _weigh_object and _weight_multiplier — the former produces the raw score, the latter the multiplier, and their product is the final score — so every weigher class must implement both. The weighed objects are then sorted by weight and returned. To define your own weigher, subclass BaseHostWeigher and override _weigh_object and _weight_multiplier; the resulting weight is their product.
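By way of example, a minimal custom weigher could look like this sketch (hypothetical; it mirrors what the stock RAM weigher does):
from nova.scheduler import weights


class FreeRamWeigher(weights.BaseHostWeigher):
    """Score hosts by free RAM, like the built-in RAM weigher."""

    def _weight_multiplier(self):
        # a negative multiplier would invert the preference,
        # stacking instances instead of spreading them
        return 1.0

    def _weigh_object(self, host_state, weight_properties):
        return host_state.free_ram_mb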
nova-scheduler is responsible for choosing the most suitable host (the hypervisor, i.e. the node running nova-compute) on which to launch instances: given the list of instances to start, it schedules them according to their number, parameters, and so on. The H release implements two scheduling algorithms, the filter scheduler and the chance (random) scheduler. The current default is the filter scheduler. Its idea: first filter hosts on conditions — for example, require more than 2 GB of free memory and drop anything below that — with filters chainable, filtering layer by layer; then compute weights for the surviving hosts, i.e. score them (weight) by host state, and sort by score, the highest-scoring host being the best candidate; scoring can be chained too, layer by layer. Note that OpenStack's design principle is extensibility: the scheduler, the filters, and the weighers are all pluggable, so users can define their own and enable them purely in the configuration file, without touching core code. Many filters are implemented; the only weigher so far scores by memory, i.e. by memory usage.
Note: where an instance lands is not entirely up to the scheduler. Users can constrain the candidate hosts with availability zones, aggregates, and --hint options — which at bottom is still filtering, just on user-supplied conditions.
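Before diving into the source, here is a toy, self-contained sketch of the filter-then-weigh idea described above (the host data and the 2 GB threshold are invented for illustration):
import random

hosts = [
    {"name": "node1", "free_ram_mb": 1024},
    {"name": "node2", "free_ram_mb": 4096},
    {"name": "node3", "free_ram_mb": 8192},
]

# phase 1: filtering -- drop hosts with less than 2 GB of free RAM
candidates = [h for h in hosts if h["free_ram_mb"] >= 2048]

# phase 2: weighing -- score the survivors (here simply by free RAM) and sort
candidates.sort(key=lambda h: h["free_ram_mb"], reverse=True)

# pick at random from the top N, as scheduler_host_subset_size does
subset_size = 2
chosen = random.choice(candidates[:subset_size])
print("scheduled on %s" % chosen["name"])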
Let's see how nova-scheduler works, starting from its entry point, the manager. Partial code:
def run_instance(self, context, request_spec, admin_password,
                 injected_files, requested_networks, is_first_time,
                 filter_properties, legacy_bdm_in_spec=True):
    """Tries to call schedule_run_instance on the driver.
    Sets instance vm_state to ERROR on exceptions
    """
    instance_uuids = request_spec['instance_uuids']
    with compute_utils.EventReporter(context, conductor_api.LocalAPI(),
                                     'schedule', *instance_uuids):
        try:
            return self.driver.schedule_run_instance(context,
                    request_spec, admin_password, injected_files,
                    requested_networks, is_first_time,
                    filter_properties, legacy_bdm_in_spec)
        except exception.NoValidHost as ex:
            # don't re-raise
            self._set_vm_state_and_notify('run_instance',
                                          {'vm_state': vm_states.ERROR,
                                           'task_state': None},
                                          context, ex, request_spec)
        except Exception as ex:
            with excutils.save_and_reraise_exception():
                self._set_vm_state_and_notify('run_instance',
                                              {'vm_state': vm_states.ERROR,
                                               'task_state': None},
                                              context, ex, request_spec)
The method first obtains the uuids of the instances to create, then directly calls the driver's schedule_run_instance. The driver is the scheduler: every scheduler must inherit from driver.Scheduler and implement three abstract methods — schedule_run_instance, select_destinations, and select_hosts. The driver comes from the configuration file and defaults to nova.scheduler.filter_scheduler.FilterScheduler, as follows:
scheduler_driver_opt = cfg.StrOpt('scheduler_driver',
        default='nova.scheduler.filter_scheduler.FilterScheduler',
        help='Default driver to use for the scheduler')

CONF = cfg.CONF
CONF.register_opt(scheduler_driver_opt)


class SchedulerManager(manager.Manager):
    """Chooses a host to run instances on."""

    def __init__(self, scheduler_driver=None, *args, **kwargs):
        if not scheduler_driver:
            scheduler_driver = CONF.scheduler_driver
        self.driver = importutils.import_object(scheduler_driver)
        self.compute_rpcapi = compute_rpcapi.ComputeAPI()
        super(SchedulerManager, self).__init__(service_name='scheduler',
                                               *args, **kwargs)
    # ... other code omitted
So the entry function simply invokes the driver's (the scheduler's) schedule_run_instance. For simplicity, take the chance scheduler as the example and see how it works, starting with its schedule_run_instance:
def schedule_run_instance(self, context, request_spec,
                          admin_password, injected_files,
                          requested_networks, is_first_time,
                          filter_properties, legacy_bdm_in_spec):
    """Create and run an instance or instances."""
    instance_uuids = request_spec.get('instance_uuids')
    for num, instance_uuid in enumerate(instance_uuids):
        request_spec['instance_properties']['launch_index'] = num
        try:
            host = self._schedule(context, CONF.compute_topic,
                                  request_spec, filter_properties)
            updated_instance = driver.instance_update_db(context,
                                                         instance_uuid)
            self.compute_rpcapi.run_instance(context,
                    instance=updated_instance, host=host,
                    requested_networks=requested_networks,
                    injected_files=injected_files,
                    admin_password=admin_password,
                    is_first_time=is_first_time,
                    request_spec=request_spec,
                    filter_properties=filter_properties,
                    legacy_bdm_in_spec=legacy_bdm_in_spec)
        except Exception as ex:
            # NOTE(vish): we don't reraise the exception here to make sure
            #             that all instances in the request get set to
            #             error properly
            driver.handle_schedule_error(context, ex, instance_uuid,
                                         request_spec)
The method first gets the uuid list of all instances to start. For each, it calls _schedule (which returns one host), updates the database, then calls nova-compute's remote API (rpcapi) run_instance to boot the instance on the chosen host — and nova-scheduler's job is done. Now the _schedule implementation:
def _schedule(self, context, topic, request_spec, filter_properties):
    """Picks a host that is up at random."""
    elevated = context.elevated()
    # hosts_up comes from the parent Scheduler class; it returns every
    # host whose nova-compute service is up
    hosts = self.hosts_up(elevated, topic)
    if not hosts:
        msg = _("Is the appropriate service running?")
        raise exception.NoValidHost(reason=msg)

    # drop hosts on the blacklist
    hosts = self._filter_hosts(request_spec, hosts, filter_properties)
    if not hosts:
        msg = _("Could not find another compute")
        raise exception.NoValidHost(reason=msg)

    # return one of the hosts at random
    return random.choice(hosts)
The method is very simple: get the list of hosts whose service is up, filter out the blacklisted ones, and return one at random via random.choice.
This article only sketched the simple chance algorithm; the filter algorithm is more involved and is covered in a separate article.
To understand how nova works, start at its entry point: novaclient! Both the nova command and Horizon call novaclient.
GitHub address: https://github.com/openstack/python-novaclient
novaclient's job is simple: parse arguments, build and send URL requests, and handle the results. Take nova --debug list: it must first parse the --debug option (along with environment variables and defaults), then parse the list subcommand and look up its callback — for list, that is novaclient.v1_1.shell.do_list.
Let's examine how it works in detail. First: what exactly is the nova command?
which nova | xargs -I{} file {}  # prints: /usr/bin/nova: a /usr/bin/python script, ASCII text executable
So the nova command is just a Python script. Let's open it:
#!/usr/bin/python
# PBR Generated from 'console_scripts'

import sys

from novaclient.shell import main

if __name__ == "__main__":
    sys.exit(main())
The nova command calls novaclient.shell's main function — from here we are inside novaclient, so let's dig in!
First, the main function of novaclient.shell:
def main():
    """Entry point."""
    try:
        OpenStackComputeShell().main(map(strutils.safe_decode,
                                         sys.argv[1:]))
    except Exception as e:
        logger.debug(e, exc_info=1)
        print("ERROR: %s" % strutils.safe_encode(six.text_type(e)),
              file=sys.stderr)
        sys.exit(1)
It in turn calls OpenStackComputeShell's main, which is the real entry point. The first half of its code:
def main(self, argv):
    # Parse args once to find version and debug settings
    parser = self.get_base_parser()  # add options such as --user, --password, ...
    (options, args) = parser.parse_known_args(argv)
    # with --debug, set the logger level to DEBUG and log to stdout
    self.setup_debugging(options.debug)

    # Discover available auth plugins
    novaclient.auth_plugin.discover_auth_systems()

    # build available subcommands based on version
    self.extensions = self._discover_extensions(
            options.os_compute_api_version)
    self._run_extension_hooks('__pre_parse_args__')

    # NOTE(dtroyer): Hackery to handle --endpoint_type due to argparse
    #                thinking usage-list --end is ambiguous; but it
    #                works fine with only --endpoint-type present
    #                Go figure.
    if '--endpoint_type' in argv:
        spot = argv.index('--endpoint_type')
        argv[spot] = '--endpoint-type'

    # parse the subcommands for this API version
    subcommand_parser = self.get_subcommand_parser(
            options.os_compute_api_version)
    self.parser = subcommand_parser

    # with --help, print the help message and exit
    if options.help or not argv:
        subcommand_parser.print_help()
        return 0

    args = subcommand_parser.parse_args(argv)  # parse the command line, e.g. argv=['list']
    # print("args = %s" % args)
    self._run_extension_hooks('__post_parse_args__', args)

    # Short-circuit and deal with help right away.
    # the `nova help xxxx` command
    if args.func == self.do_help:
        self.do_help(args)
        return 0
    # nova bash-completion
    elif args.func == self.do_bash_completion:
        self.do_bash_completion(args)
        return 0
parser is a NovaClientArgumentParser, which subclasses argparse.ArgumentParser; argparse is Python's argument-parsing library.
get_base_parser adds the option arguments — --debug, --timings, --os-username and so on — reading environment variables and setting defaults along the way. Partial code:
# Global arguments
parser.add_argument('-h', '--help',
                    action='store_true',
                    help=argparse.SUPPRESS,
                    )

parser.add_argument('--version',
                    action='version',
                    version=novaclient.__version__)

parser.add_argument('--debug',
                    default=False,
                    action='store_true',
                    help="Print debugging output")

parser.add_argument('--no-cache',
                    default=not utils.bool_from_str(
                        utils.env('OS_NO_CACHE', default='true')),
                    action='store_false',
                    dest='os_cache',
                    help=argparse.SUPPRESS)
parser.add_argument('--no_cache',
                    action='store_false',
                    dest='os_cache',
                    help=argparse.SUPPRESS)

parser.add_argument('--os-cache',
                    default=utils.env('OS_CACHE', default=False),
                    action='store_true',
                    help="Use the auth token cache.")

parser.add_argument('--timings',
                    default=False,
                    action='store_true',
                    help="Print call timing info")

parser.add_argument('--timeout',
                    default=600,
                    metavar='<seconds>',
                    type=positive_non_zero_float,
                    help="Set HTTP call timeout (in seconds)")
Anyone who has used the argparse library will find this familiar.
Back in main, debugging is set up next: if --debug was given, the logger level is set to DEBUG and output goes to the standard output stream.
(options, args) = parser.parse_known_args(argv) returns the parse result: options holds all the option arguments, args the positional ones. For nova --debug list, options.debug is True and args is ['list'].
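The same split in a standalone snippet (plain argparse, independent of novaclient):
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--debug', default=False, action='store_true')

# parse_known_args keeps unrecognized tokens instead of raising an error
options, args = parser.parse_known_args(['--debug', 'list'])
print(options.debug)  # True
print(args)           # ['list']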
The next function, get_subcommand_parser, is a core method: it handles subcommands such as list, flavor-list, and boot. The code:
def get_subcommand_parser(self, version):
    parser = self.get_base_parser()

    self.subcommands = {}
    subparsers = parser.add_subparsers(metavar='<subcommand>')

    try:
        actions_module = {
            '1.1': shell_v1_1,
            '2': shell_v1_1,
            '3': shell_v3,
        }[version]
    except KeyError:
        actions_module = shell_v1_1  # default to version 1.1

    self._find_actions(subparsers, actions_module)
    self._find_actions(subparsers, self)

    for extension in self.extensions:
        self._find_actions(subparsers, extension.module)

    self._add_bash_completion_subparser(subparsers)

    return parser
This method looks up the available actions for the given version (1.1 by default). Assume the shell_v1_1 module, imported via from novaclient.v1_1 import shell as shell_v1_1; it is passed to _find_actions. Note that a module is being passed in — in Python everything is an object, modules included, so think of it as passing something like Java's XXXClass.class. The code:
def _find_actions(self, subparsers, actions_module):
    # e.g. actions_module = shell_v1_1
    for attr in (a for a in dir(actions_module) if a.startswith('do_')):
        # e.g. attr = 'do_flavor_list'
        # I prefer to be hyphen-separated instead of underscores.
        command = attr[3:].replace('_', '-')  # do_flavor_list -> flavor-list
        callback = getattr(actions_module, attr)
        desc = callback.__doc__ or ''
        action_help = desc.strip()
        arguments = getattr(callback, 'arguments', [])

        subparser = subparsers.add_parser(command,
            help=action_help,
            description=desc,
            add_help=False,
            formatter_class=OpenStackHelpFormatter
        )
        subparser.add_argument('-h', '--help',
            action='help',
            help=argparse.SUPPRESS,
        )
        self.subcommands[command] = subparser
        for (args, kwargs) in arguments:
            subparser.add_argument(*args, **kwargs)
        subparser.set_defaults(func=callback)
So the method uses reflection to gather every attribute whose name starts with do_: for a do_XXX_XXX method, XXX-XXX becomes the command name and do_XXX_XXX its callback. Assigning a function to the callback variable is a classic functional-programming move; the callback is finally recorded via set_defaults.
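Here is the discovery pattern in isolation — a toy script standing in for shell_v1_1 and _find_actions:
import argparse


def do_flavor_list(args):
    print("would list flavors")


parser = argparse.ArgumentParser()
subparsers = parser.add_subparsers(metavar='<subcommand>')

# discover do_* functions by reflection, as _find_actions does
for name, func in list(globals().items()):
    if name.startswith('do_') and callable(func):
        sub = subparsers.add_parser(name[3:].replace('_', '-'))
        sub.set_defaults(func=func)

args = parser.parse_args(['flavor-list'])
args.func(args)  # dispatches to do_flavor_list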
So nova list actually calls novaclient.v1_1.shell.do_list(), and nova flavor-list calls novaclient.v1_1.shell.do_flavor_list(). Let's follow nova --debug flavor-list further.
Looking at the novaclient.v1_1.shell source, there are plenty of do_XXX methods, but none of them do the real work themselves — they delegate to cs, whatever that is for now. Here is do_flavor_list:
def do_flavor_list(cs, args):
    """Print a list of available 'flavors' (sizes of servers)."""
    if args.all:
        flavors = cs.flavors.list(is_public=None)
    else:
        flavors = cs.flavors.list()
    _print_flavor_list(flavors, args.extra_specs)
We still don't know what cs is, so back to main. The middle of main is mostly parameter checking, which we skip; jump straight to the end:
def main(self, argv):
    # ... (argument parsing, exactly as shown above) ...

    # large amounts of code omitted here

    self.cs = client.Client(options.os_compute_api_version,
            os_username, os_password, os_tenant_name,
            tenant_id=os_tenant_id, auth_url=os_auth_url,
            insecure=insecure, region_name=os_region_name,
            endpoint_type=endpoint_type, extensions=self.extensions,
            service_type=service_type, service_name=service_name,
            auth_system=os_auth_system, auth_plugin=auth_plugin,
            volume_service_name=volume_service_name,
            timings=args.timings, bypass_url=bypass_url,
            os_cache=os_cache, http_log_debug=options.debug,
            cacert=cacert, timeout=timeout)

    # large amounts of code omitted here

    args.func(self.cs, args)  # here func is do_flavor_list

    if args.timings:  # with --timings, print request timing info
        self._dump_timings(self.cs.get_timings())
So cs is whatever client.Client returns; see its code in client.py:
def get_client_class(version):
    version_map = {
        '1.1': 'novaclient.v1_1.client.Client',
        '2': 'novaclient.v1_1.client.Client',
        '3': 'novaclient.v3.client.Client',
    }
    try:
        client_path = version_map[str(version)]
    except (KeyError, ValueError):
        msg = "Invalid client version '%s'. must be one of: %s" % (
              (version, ', '.join(version_map.keys())))
        raise exceptions.UnsupportedVersion(msg)

    return utils.import_class(client_path)


def Client(version, *args, **kwargs):
    client_class = get_client_class(version)
    return client_class(*args, **kwargs)
Clearly cs is the Client class selected by version — here novaclient.v1_1.client.Client. This module can be viewed as the registry of the feature modules: flavor operations, for example, live in flavors.py, and to take effect they must be registered in Client via self.flavors = flavors.FlavorManager(self):
self.projectid = project_id
self.tenant_id = tenant_id
self.flavors = flavors.FlavorManager(self)
self.flavor_access = flavor_access.FlavorAccessManager(self)
self.images = images.ImageManager(self)
self.limits = limits.LimitsManager(self)
self.servers = servers.ServerManager(self)
So cs.flavors.list() in do_flavor_list calls flavors.FlavorManager().list. Here we see OpenStack's design principle again — free, flexible extensibility: adding a new feature requires hardly any changes to existing code; you only have to register it in Client.
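Incidentally, the same registry makes novaclient usable directly as a library — a minimal sketch (the credentials and Keystone URL below are placeholders):
from novaclient.v1_1 import client

# placeholder credentials and endpoint, for illustration only
cs = client.Client("username", "password", "tenant_name",
                   "http://keystone:5000/v2.0")

for flavor in cs.flavors.list():
    print(flavor.name)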
Now the list method in flavors.py:
def list(self, detailed=True, is_public=True):
    """
    Get a list of all flavors.

    :rtype: list of :class:`Flavor`.
    """
    qparams = {}
    # is_public is ternary - None means give all flavors.
    # By default Nova assumes True and gives admins public flavors
    # and flavors from their own projects only.
    if not is_public:
        qparams['is_public'] = is_public
    query_string = "?%s" % urlutils.urlencode(qparams) if qparams else ""

    detail = ""
    if detailed:
        detail = "/detail"

    return self._list("/flavors%s%s" % (detail, query_string), "flavors")
Clearly this assembles the list command into a URL request and calls _list. FlavorManager inherits from base.ManagerWithFind, which inherits from Manager, where _list is defined:
def _list(self, url, response_key, obj_class=None, body=None):
    if body:
        _resp, body = self.api.client.post(url, body=body)
    else:
        _resp, body = self.api.client.get(url)

    if obj_class is None:
        obj_class = self.resource_class

    data = body[response_key]
    # NOTE(ja): keystone returns values as list as {'values': [ ... ]}
    #           unlike other services which just return the list...
    if isinstance(data, dict):
        try:
            data = data['values']
        except KeyError:
            pass

    with self.completion_cache('human_id', obj_class, mode="w"):
        with self.completion_cache('uuid', obj_class, mode="w"):
            return [obj_class(self, res, loaded=True)
                    for res in data if res]
The source shows the request being sent with self.api.client.post(url, body=body) or self.api.client.get(url) — POST or GET depending on whether there is a body — and the response then being processed. self.api here is in fact novaclient.v1_1.client.Client; earlier we called it cs, here it is called api.
Returning to novaclient.v1_1.client.Client's constructor: besides registering the series of features, there is one special member,
self.client = client.HTTPClient(username,
                                password,
                                projectid=project_id,
                                tenant_id=tenant_id,
                                auth_url=auth_url,
                                insecure=insecure,
                                timeout=timeout,
                                auth_system=auth_system,
                                auth_plugin=auth_plugin,
                                proxy_token=proxy_token,
                                proxy_tenant_id=proxy_tenant_id,
                                region_name=region_name,
                                endpoint_type=endpoint_type,
                                service_type=service_type,
                                service_name=service_name,
                                volume_service_name=volume_service_name,
                                timings=timings,
                                bypass_url=bypass_url,
                                os_cache=self.os_cache,
                                http_log_debug=http_log_debug,
                                cacert=cacert)
This self.client is a client.HTTPClient, the class that actually sends the URL requests. Partial code:
def _cs_request(self, url, method, **kwargs):
    if not self.management_url:
        self.authenticate()

    # Perform the request once. If we get a 401 back then it
    # might be because the auth token expired, so try to
    # re-authenticate and try again. If it still fails, bail.
    try:
        kwargs.setdefault('headers', {})['X-Auth-Token'] = self.auth_token
        if self.projectid:
            kwargs['headers']['X-Auth-Project-Id'] = self.projectid

        resp, body = self._time_request(self.management_url + url,
                                        method, **kwargs)
        return resp, body
    except exceptions.Unauthorized as e:
        try:
            # first discard auth token, to avoid the possibly expired
            # token being re-used in the re-authentication attempt
            self.unauthenticate()
            self.authenticate()
            kwargs['headers']['X-Auth-Token'] = self.auth_token
            resp, body = self._time_request(self.management_url + url,
                                            method, **kwargs)
            return resp, body
        except exceptions.Unauthorized:
            raise e


def get(self, url, **kwargs):
    return self._cs_request(url, 'GET', **kwargs)
In the end http.request sends the request, using the Requests library ("HTTP for Humans"), which is nicer than httplib2 — see http://docs.python-requests.org/en/latest/. Receiving the request is nova-api's business, which we won't go into here.
Next, let's add a trivial little feature called test. First touch test.py under novaclient/v1_1 and add the following code:
""" Test interface. """ from novaclient import base class Test(base.Resource): def test(self): print("This is a test") class TestManager(base.Manager): def test(self): print("This is a test")
Then register it in the client: edit the novaclient/v1_1/client.py file and add self.test = test.TestManager(self).
Then add the entry function in the shell to register the new subcommand:
def do_test(cs, _args):
    """do test."""
    cs.test.test()
Run nova test and nova help test to see the effect.
OpenStack supports several kinds of live migration. Below we configure block migration, which live-migrates by copying: it is simple to configure, but copying files from the source compute node to the target takes time. The alternative, shared storage, puts everything under /var/lib/nova/instances in one place — for example on the controller — with every compute node mounting that directory over NFS; migration then copies nothing, but concentrating all the compute nodes' files in one place is awkward to manage. See the official documentation for that setup.
The block-migration configuration:
First, add the following line to nova.conf on every compute node to enable live migration:
live_migration_flag=VIR_MIGRATE_UNDEFINE_SOURCE,VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_LIVE,VIR_MIGRATE_UNSAFE
Then configure passwordless virsh connections by modifying /etc/libvirt/libvirtd.conf (see the TCP-listening setup at the beginning of this article).
Download the official ISO image: http://pan.baidu.com/s/1c0owVc8
Download the virtio drivers: http://alt.fedoraproject.org/pub/alt/virtio-win/archives/virtio-win-0.1-59/virtio-win-0.1-59.iso
Create a qcow2 virtual disk:
kvm-img create -f qcow2 xp.qcow2 10G
Install XP onto the virtual disk we just created. Some tutorials say to attach the virtio-win-xx.vfd floppy and the virtio drivers at this point; XP does not actually need them yet — we will install them later!
kvm -m 1024 -cdrom xp.iso -drive file=xp.qcow2 -boot d
Once the system is installed, boot into it again with the virtio devices attached, and install the virtio drivers:
kvm -hda xp.qcow2 \
    -drive file=xp.qcow2,if=virtio \
    -drive file=virtio-win-0.1-30.iso,media=cdrom,index=1 \
    -net nic,model=virtio \
    -net user \
    -boot d \
    -vga std \
    -m 1024
Inside XP, open My Computer -> Manage -> Device Manager and update the SCSI and NIC drivers. The SCSI driver is mandatory — without it the system blue-screens later.
With the drivers installed, the image can be uploaded to OpenStack:
glance image-create --name xp --container-format=ovf --disk-format=qcow2 --file xp.qcow2 --progress
When creating an instance, the root disk must be at least as large as the virtual disk we created; an ephemeral disk shows up as a new, unformatted virtual disk; swap is unnecessary. Once created, log into the system.
At this point drive C is the same size as the virtual disk. If the allocated disk is larger — say the virtual disk is 10 GB and the instance's root disk 20 GB — use a partition-resizing tool to grow drive C: http://pan.baidu.com/s/1eQh7q9c. If there is an ephemeral disk, initialize it in the disk manager (My Computer -> Manage -> Disk Management) and then format it.
A newly attached volume likewise needs initializing and formatting if it has not been initialized yet.
If remote logins are needed, enable the Remote Desktop feature as well; I usually also turn off the firewall.
This is an image I made, ready to use: http://pan.baidu.com/s/1pJEyVGZ