Edit /etc/libvirt/libvirtd.conf:
listen_tls = 0
listen_tcp = 1
auth_tcp = "none"
tcp_port = "16509"
Even with listen_tcp set, the listening service will not start. Restart the libvirt-bin service and verify:
sudo netstat -nlpt
# port 16509 is not open
ps aux | grep libvirt
# libvirtd is running without the -l flag
To enable listening, edit /etc/init/libvirt-bin.conf and set exec /usr/sbin/libvirtd $libvirtd_opts -l. Note the trailing -l option: it cannot simply be appended to libvirtd_opts — it does not take effect that way, for reasons unclear.
Restart the libvirt-bin service, then use netstat to check that the TCP port is open and ps to confirm libvirtd has the -l option. Once both look right, run:
virsh --connect qemu+tcp://node1/system list
where node1 is the hostname. If no error is reported, the TCP listening service is working.
OpenStack is a cloud-platform management project composed of several major components, aiming to provide a platform for building and managing public and private clouds. Jointly developed by Rackspace and NASA, it helps service providers and enterprises deliver Infrastructure as a Service (IaaS) similar to Amazon EC2 and S3. See the OpenStack website for details.
This project aims to provide an easy-to-call, easy-to-extend OpenStack development kit. It is a Java library wrapping the OpenStack API, originally developed for my lab's needs. After roughly three months of on-and-off work, only the basic features are implemented; many bugs remain and some functionality is still missing.
Project address: https://github.com/krystism/openstack-java-sdk
First, build the project with Maven:
mvn package
To bundle all dependencies into a single package, use:
mvn assembly:assembly
The project needs a configuration file whose path is given by the OPENSTACK_CONF_PATH environment variable (default /etc/openstack); see the configuration notes. I studied the architecture of the official OpenStack Python clients and tried to keep the API simple and convenient. Demo:
OpenstackSession session = OpenstackSession.getSession("username", "password");// get session
Nova nova = session.getNovaClient(); // get nova client
// get a flavor list, print their name
for (Flavor flavor : nova.flavors.list()) {
    System.out.println(flavor.getName());
}
// create a new server
Server server = new Server();
server.setName("demo");
server.setImageRef("imageId");
server.setFlavorRef("flavorId");
// some other config
nova.servers.create(server); // call create method to execute.
Given limited time and my modest coding skills, finishing every feature perfectly is beyond me. I hope more experienced developers will join in, point out the shortcomings of my current work, or even refactor the whole codebase. My sincere thanks to every willing contributor!
Here is how I currently extend the SDK with new functionality:
A model is the Java bean for an entity. Incoming data is generally JSON, so a JSON-encoded object has to be converted into a Java object. You can define your beans however you like; all of mine extend AbstractEntity, annotate properties with @Property, and take a JSONObject in the constructor — usually calling the base-class constructor is enough to convert the JSONObject into the bean. Model demo:
@Entity("volume_type")
public class VolumeType extends AbstractEntity {
    private static final long serialVersionUID = -6539238104579991330L;
    @Property("extra_specs")
    private JSONObject metadata;
    @Property("name")
    private String name;
    @Property("id")
    private String id;
    public VolumeType() {
        super();
    }
    public VolumeType(JSONObject jsonObj) {
        super(jsonObj);
    }
    public String getName() {
        return name;
    }
    public void setName(String name) {
        this.name = name;
    }
    // some other getters and setters here
}
A manager interface defines the operations on a bean; these operations may have different implementations depending on the API version. I did not use a common base interface or interface inheritance, because the operations differ greatly between entity beans — even identical operations do not take the same parameters. A manager interface demo:
public interface ServerManager {
    /**
     * Get a server.
     * @param id ID of the Server to get.
     * @return Server
     * @throws OperationException
     */
    Server get(String id) throws OperationException;
    /**
     * Get a list of servers.
     * @return List of servers
     * @throws OperationException
     */
    List<Server> list() throws OperationException;
    /**
     * Stop (power off) the server.
     * @param id The ID of the server to stop.
     * @throws OperationException
     */
    void stop(String id) throws OperationException;
    /**
     * Start (power on) the server.
     * @param id The ID of the server to start.
     * @throws OperationException
     */
    void start(String id) throws OperationException;
    /**
     * Reboot a server (software-level reboot).
     * @param id The ID of the server to reboot.
     * @see reboot(String id, boolean hard)
     */
    void reboot(String id) throws OperationException;
    /**
     * Update the name of a server.
     * @param id The ID of the server to rename.
     * @param name The new name of the server.
     * @throws OperationException
     */
    void rename(String id, String name) throws OperationException;
    /**
     * Create (boot) a new server.<br/>
     * <i>Remember</i>: you must set name, imageRef and flavorRef!
     * @param instance The server to create.
     * @return The created server.
     * @throws OperationException
     */
    Server create(Server instance) throws OperationException;
    /**
     * Delete (i.e. shut down and delete the image) this server.
     * @param id The ID of the server to delete.
     * @throws OperationException
     */
    void delete(String id) throws OperationException;
}
Next come the concrete implementations of the interface, which may differ by API version; in practice the version to use can be chosen via the configuration file. An implementation demo:
public class Flavors extends AbstractManager<Flavor> implements FlavorManager {
    private static final String PREFIX = "/flavors";
    public Flavors(Authenticated credentical) {
        super(credentical, Flavor.class);
    }
    /**
     * Get a list of all flavors.
     * @return A List holding the flavors
     * @throws OperationException
     */
    @Override
    public List<Flavor> list() throws OperationException {
        return _list(PREFIX + "/detail");
    }
    /**
     * Get a specific Flavor.
     * @param id The ID of the Flavor to get.
     * @return Flavor
     * @throws OperationException
     */
    @Override
    public Flavor get(String id) throws OperationException {
        return _get(PREFIX + "/" + id);
    }
    /**
     * Delete a specific Flavor.
     * @param id The ID of the Flavor to delete.
     * @throws OperationException
     */
    @Override
    public void delete(String id) throws OperationException {
        _delete(PREFIX + "/" + id);
    }
    /**
     * Create a new flavor for a tenant.
     * @param flavor The flavor to create
     * @return The new flavor
     * @throws OperationException
     */
    @Override
    public Flavor create(Flavor flavor) throws OperationException {
        return _create(PREFIX, flavor);
    }
}
Registering a new feature means adding its manager to a client (Nova, Glance, etc.). For now I hard-code this; in practice a factory should create the manager for the API version chosen in the configuration file. Demo:
public class Nova {
    public final FlavorManager flavors;
    public final HypervisorManager hypervisors;
    public final ServerManager servers;
    public final KeyPairManager keypairs;
    public Nova(Authenticated credentical) {
        // hard-coded wiring -- bad practice, don't do this!
        flavors = new Flavors(credentical);
        hypervisors = new Hypervisors(credentical);
        servers = new Servers(credentical);
        keypairs = new KeyPairs(credentical);
    }
}
How do you add behavior to an already-implemented interface — say, logging before each request, or caching? Just add a decorator. Demo:
public class FlavorCachedManager implements FlavorManager {
    private FlavorManager flavors;
    public FlavorCachedManager(FlavorManager flavors) {
        this.flavors = flavors;
    }
    @Override
    public Flavor get(String id) throws OperationException {
        Flavor flavor = getFromCache(id);
        if (flavor == null) {
            flavor = flavors.get(id);
            addToCache(flavor);
        }
        return flavor;
    }
}
The MIT License (MIT)
When taking a backup snapshot of a VM, the image is usually larger than the actual data. To shrink it, you'll need to zero out all free space of the partitions contained within the guest first.
See the Proxmox wiki: https://pve.proxmox.com/wiki/Shrink_Qcow2_Disk_Files
For Linux images, first delete any files you no longer need, then fill the free space with zeros:
dd if=/dev/zero of=/mytempfile
# this can take some time
rm -f /mytempfile
mv image.qcow2 image.qcow2_backup
# rewrite the image; the zeroed blocks are dropped
qemu-img convert -O qcow2 image.qcow2_backup image.qcow2
# or add -c to also compress the result
qemu-img convert -O qcow2 -c image.qcow2_backup image.qcow2
Because the telecom operator needs to deploy software remotely, we plan to use puppet for remote service management.
Without cloud-init installed, the MTU must be set to 1454 manually or the guest cannot reach the network. This is only necessary when using neutron with GRE tunnels.
Add commonly used host entries to C:/Windows/System32/drivers/etc/hosts — above all the master's entry; be sure to set this!
The agent must stay time-synchronized with the master, so point the Windows time-update server at the master.
The win32-dir gem (version > 0.43) is required. To install it, open "Start Command Prompt with Puppet" as administrator and run:
gem install win32-dir
The timezone fact outputs Chinese by default, which causes encoding errors; set setcode to an English string, ideally a standard tz database name.
Download: the official website
Before this, clear the puppet SSL directory (under C:/programdata), and adjust puppet.conf as needed. To keep the disk file small, run Disk Cleanup and delete any unneeded files.
Converting the image format both merges the base image and re-compacts the disk file, reducing its size. Run:
qemu-img convert -O qcow2 origin.qcow2 new.qcow2
Upload the image to glance and remember to set os_type = windows; otherwise you will run into RTC (hardware clock) time problems.
When booting an instance from it, also set os_type = windows, for the same reason!
The previous article (openstack之nova-scheduler) introduced nova-scheduler; for simplicity it described only the chance (random) algorithm, not the filter algorithm in detail. The filter algorithm is not hard: first filter out, layer by layer, the hosts that fail certain conditions; then score (weigh) the remaining hosts, summing the scores from multiple weighers, and take a host with a relatively high score (not necessarily the highest — the reason is explained below) as the candidate. For the full description see the official documentation: http://docs.openstack.org/trunk/config-reference/content/section_compute-scheduler.html Below we walk through the algorithm step by step in the source code.
First, look at the schedule_run_instance method:
def schedule_run_instance(self, context, request_spec,
                          admin_password, injected_files,
                          requested_networks, is_first_time,
                          filter_properties, legacy_bdm_in_spec):
    """This method is called from nova.compute.api to provision
    an instance. We first create a build plan (a list of WeightedHosts)
    and then provision.
    Returns a list of the instances created.
    """
    payload = dict(request_spec=request_spec)
    self.notifier.info(context, 'scheduler.run_instance.start', payload)
    instance_uuids = request_spec.get('instance_uuids')  # get the uuids; there may be several
    LOG.info(_("Attempting to build %(num_instances)d instance(s) "
               "uuids: %(instance_uuids)s"),
             {'num_instances': len(instance_uuids),
              'instance_uuids': instance_uuids})
    LOG.debug(_("Request Spec: %s") % request_spec)
    # returns the list of candidate hosts
    weighed_hosts = self._schedule(context, request_spec,
                                   filter_properties, instance_uuids)
    # NOTE: Pop instance_uuids as individual creates do not need the
    # set of uuids. Do not pop before here as the upper exception
    # handler for NoValidHost needs the uuid to set error state
    instance_uuids = request_spec.pop('instance_uuids')  # pop the uuids; no longer needed
    # NOTE(comstud): Make sure we do not pass this through. It
    # contains an instance of RpcContext that cannot be serialized.
    filter_properties.pop('context', None)
    for num, instance_uuid in enumerate(instance_uuids):
        request_spec['instance_properties']['launch_index'] = num
        try:
            try:
                weighed_host = weighed_hosts.pop(0)  # pop the first (best) host
                LOG.info(_("Choosing host %(weighed_host)s "
                           "for instance %(instance_uuid)s"),
                         {'weighed_host': weighed_host,
                          'instance_uuid': instance_uuid})
            except IndexError:
                raise exception.NoValidHost(reason="")
            self._provision_resource(context, weighed_host,
                                     request_spec,
                                     filter_properties,
                                     requested_networks,
                                     injected_files, admin_password,
                                     is_first_time,
                                     instance_uuid=instance_uuid,
                                     legacy_bdm_in_spec=legacy_bdm_in_spec)
        except Exception as ex:
            # NOTE(vish): we don't reraise the exception here to make sure
            #             that all instances in the request get set to
            #             error properly
            driver.handle_schedule_error(context, ex, instance_uuid,
                                         request_spec)
        # scrub retry host list in case we're scheduling multiple
        # instances:
        retry = filter_properties.get('retry', {})
        retry['hosts'] = []
    self.notifier.info(context, 'scheduler.run_instance.end', payload)
After some argument handling, this method first calls _schedule, which returns the list of candidate hosts; then for each instance to boot it calls _provision_resource, passing in the corresponding target host. _provision_resource updates the database and invokes the nova-compute rpcapi to boot the instance on the chosen host. The core is the _schedule method:
def _schedule(self, context, request_spec, filter_properties,
              instance_uuids=None):
    """Returns a list of hosts that meet the required specs,
    ordered by their fitness.
    """
    elevated = context.elevated()
    instance_properties = request_spec['instance_properties']
    instance_type = request_spec.get("instance_type", None)  # get flavor
    # Get the group
    update_group_hosts = False
    scheduler_hints = filter_properties.get('scheduler_hints') or {}
    group = scheduler_hints.get('group', None)
    # --hint group SERVER_GROUP; if a group is given, record its hosts
    if group:
        group_hosts = self.group_hosts(elevated, group)
        update_group_hosts = True
        if 'group_hosts' not in filter_properties:
            filter_properties.update({'group_hosts': []})
        configured_hosts = filter_properties['group_hosts']
        filter_properties['group_hosts'] = configured_hosts + group_hosts
    config_options = self._get_configuration_options()
    # check retry policy. Rather ugly use of instance_uuids[0]...
    # but if we've exceeded max retries... then we really only
    # have a single instance.
    properties = instance_properties.copy()
    if instance_uuids:
        properties['uuid'] = instance_uuids[0]
    self._populate_retry(filter_properties, properties)  # raises NoValidHost once max retries are exceeded
    filter_properties.update({'context': context,
                              'request_spec': request_spec,
                              'config_options': config_options,
                              'instance_type': instance_type})
    # fill filter_properties with data such as project_id, os_type, etc.
    self.populate_filter_properties(request_spec,
                                    filter_properties)
    # Find our local list of acceptable hosts by repeatedly
    # filtering and weighing our options. Each time we choose a
    # host, we virtually consume resources on it so subsequent
    # selections can adjust accordingly.
    # Note: remember, we are using an iterator here. So only
    # traverse this list once. This can bite you if the hosts
    # are being scanned in a filter or weighing function.
    # get all host states; host_manager is set in the parent __init__ from CONF,
    # defaulting to nova.scheduler.host_manager.HostManager, which reads the DB directly
    hosts = self.host_manager.get_all_host_states(elevated)
    selected_hosts = []
    if instance_uuids:
        num_instances = len(instance_uuids)
    else:
        num_instances = request_spec.get('num_instances', 1)
    # note the difference between range and xrange: range builds a list,
    # xrange returns a generator
    for num in xrange(num_instances):
        # Filter local hosts based on requirements ...
        hosts = self.host_manager.get_filtered_hosts(hosts,
                filter_properties, index=num)
        if not hosts:
            # Can't get any more locally.
            break
        LOG.debug(_("Filtered %(hosts)s"), {'hosts': hosts})
        # compute the weights, sorted from highest to lowest
        weighed_hosts = self.host_manager.get_weighed_hosts(hosts,
                filter_properties)
        LOG.debug(_("Weighed %(hosts)s"), {'hosts': weighed_hosts})
        scheduler_host_subset_size = CONF.scheduler_host_subset_size  # cap the candidate set at the configured size
        if scheduler_host_subset_size > len(weighed_hosts):
            scheduler_host_subset_size = len(weighed_hosts)
        if scheduler_host_subset_size < 1:
            scheduler_host_subset_size = 1
        # randomly pick one host from the truncated set --
        # not necessarily the top-scored one
        chosen_host = random.choice(
            weighed_hosts[0:scheduler_host_subset_size])
        selected_hosts.append(chosen_host)
        # Now consume the resources so the filter/weights
        # will change for the next instance.
        chosen_host.obj.consume_from_instance(instance_properties)  # update state before scheduling the next instance
        if update_group_hosts is True:
            filter_properties['group_hosts'].append(chosen_host.obj.host)
    return selected_hosts
Its two core calls are host_manager.get_filtered_hosts and host_manager.get_weighed_hosts, corresponding to the filtering and weighing phases of the algorithm. Note that weighing returns a sorted host list, but the target host is not simply the one with the highest weight: a host is picked at random from the configured top N. For example, with scheduler_host_subset_size = 5 and 10 hosts surviving the filters, one of the top 5 is returned at random — this is why, as mentioned earlier, the chosen host has a relatively high score rather than the highest. host_manager is configurable and defaults to nova.scheduler.host_manager.HostManager.
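That selection step is easy to see in isolation. A minimal sketch (not nova's actual code; the host names and weights are made up), assuming a list of (host, weight) pairs already sorted by descending weight:

```python
import random

def choose_host(weighed_hosts, subset_size):
    """Pick a host at random from the top-N of a descending-sorted list,
    mirroring how scheduler_host_subset_size is applied."""
    # clamp the subset size to [1, len(weighed_hosts)], as _schedule does
    subset_size = max(1, min(subset_size, len(weighed_hosts)))
    return random.choice(weighed_hosts[:subset_size])

hosts = [("node1", 9.0), ("node2", 7.5), ("node3", 4.0), ("node4", 1.0)]
chosen = choose_host(hosts, 2)  # always one of the top two
```

With subset_size = 1 this degenerates to "always take the best host"; larger values trade a little optimality for spreading load across several good hosts.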
HostManager's get_filtered_hosts mainly calls two methods: _choose_host_filters and filter_handler.get_filtered_objects. The former maps filter class names to the classes themselves (like resolving the name "Apple" to a.b.Apple.class in Java, or getClass("Apple")). The filter names come from scheduler_default_filters in nova.conf, defaulting to 'RetryFilter', 'AvailabilityZoneFilter', 'RamFilter', 'ComputeFilter', 'ComputeCapabilitiesFilter', 'ImagePropertiesFilter'. The class list is then passed to filter_handler.get_filtered_objects; filter_handler is a filters.HostFilterHandler, which inherits from nova.filters.BaseFilterHandler, implemented as:
class BaseFilterHandler(loadables.BaseLoader):
    """Base class to handle loading filter classes.
    This class should be subclassed where one needs to use filters.
    """
    def get_filtered_objects(self, filter_classes, objs,
                             filter_properties, index=0):
        list_objs = list(objs)
        LOG.debug(_("Starting with %d host(s)"), len(list_objs))
        for filter_cls in filter_classes:
            cls_name = filter_cls.__name__
            filter = filter_cls()
            if filter.run_filter_for_index(index):
                objs = filter.filter_all(list_objs,
                                         filter_properties)
                if objs is None:
                    LOG.debug(_("Filter %(cls_name)s says to stop filtering"),
                              {'cls_name': cls_name})
                    return
                list_objs = list(objs)
                LOG.debug(_("Filter %(cls_name)s returned "
                            "%(obj_len)d host(s)"),
                          {'cls_name': cls_name, 'obj_len': len(list_objs)})
                if len(list_objs) == 0:
                    break
        return list_objs
It iterates over all the filter classes, instantiates each, calls its filter_all method, and finally returns every object that was not filtered out. Now let's look at the filter classes.
As the previous article mentioned, filters are pluggable: to define your own filter you only need to subclass BaseHostFilter (defined in nova.scheduler.filters.__init__.py) and implement host_passes, as in the following code:
class BaseHostFilter(filters.BaseFilter):
    """Base class for host filters."""
    def _filter_one(self, obj, filter_properties):
        """Return True if the object passes the filter, otherwise False."""
        return self.host_passes(obj, filter_properties)

    def host_passes(self, host_state, filter_properties):
        """Return True if the HostState passes the filter, otherwise False.
        Override this in a subclass.
        """
        raise NotImplementedError()
BaseHostFilter in turn inherits from filters.BaseFilter:
class BaseFilter(object):
    """Base class for all filter classes."""
    def _filter_one(self, obj, filter_properties):
        """Return True if it passes the filter, False otherwise.
        Override this in a subclass.
        """
        return True

    def filter_all(self, filter_obj_list, filter_properties):
        """Yield objects that pass the filter.
        Can be overriden in a subclass, if you need to base filtering
        decisions on all objects. Otherwise, one can just override
        _filter_one() to filter a single object.
        """
        for obj in filter_obj_list:
            if self._filter_one(obj, filter_properties):
                yield obj

    # Set to true in a subclass if a filter only needs to be run once
    # for each request rather than for each instance
    run_filter_once_per_request = False

    def run_filter_for_index(self, index):
        """Return True if the filter needs to be run for the "index-th"
        instance in a request. Only need to override this if a filter
        needs anything other than "first only" or "all" behaviour.
        """
        if self.run_filter_once_per_request and index > 0:
            return False
        else:
            return True
We only need to care about two methods, _filter_one and filter_all. _filter_one takes the object and the filter properties and returns a bool: True to pass, False to reject. filter_all takes a collection and, by calling _filter_one on each element, yields a generator of the elements that pass. So we usually just override _filter_one — and since BaseHostFilter's _filter_one delegates to host_passes, overriding host_passes is enough.
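To make the pattern concrete, here is a self-contained sketch of the filter mechanism. The classes are simplified stand-ins for nova's, and RamFilter with its free_ram_mb dicts is an illustrative example, not nova's actual RamFilter:

```python
class BaseFilter(object):
    """Minimal stand-in for nova.filters.BaseFilter."""
    def _filter_one(self, obj, filter_properties):
        return True

    def filter_all(self, filter_obj_list, filter_properties):
        # generator: yield only the objects that pass _filter_one
        for obj in filter_obj_list:
            if self._filter_one(obj, filter_properties):
                yield obj

class RamFilter(BaseFilter):
    """Example custom filter: a host passes if it has enough free RAM."""
    def _filter_one(self, host, filter_properties):
        return self.host_passes(host, filter_properties)

    def host_passes(self, host_state, filter_properties):
        wanted = filter_properties.get('ram_mb', 0)
        return host_state['free_ram_mb'] >= wanted

hosts = [{'name': 'n1', 'free_ram_mb': 512},
         {'name': 'n2', 'free_ram_mb': 4096}]
passed = list(RamFilter().filter_all(hosts, {'ram_mb': 2048}))
# only n2 survives the filter
```

Chaining filters is then just feeding the output generator of one filter into the next, which is exactly what get_filtered_objects does.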
The filter handler then simply calls each filter class's filter_all method.
That concludes the filtering phase; now the weighing phase. Back in _schedule, host_manager.get_weighed_hosts is called, which in turn calls weight_handler.get_weighed_objects. weight_handler is a HostWeightHandler instance, a class inheriting from nova.weights.BaseWeightHandler, implemented as:
class BaseWeightHandler(loadables.BaseLoader):
    object_class = WeighedObject

    def get_weighed_objects(self, weigher_classes, obj_list,
                            weighing_properties):
        """Return a sorted (highest score first) list of WeighedObjects."""
        if not obj_list:
            return []
        weighed_objs = [self.object_class(obj, 0.0) for obj in obj_list]
        for weigher_cls in weigher_classes:
            weigher = weigher_cls()
            weigher.weigh_objects(weighed_objs, weighing_properties)
        return sorted(weighed_objs, key=lambda x: x.weight, reverse=True)
Similar to filtering, this iterates over all the weigher classes and calls each one's weigh_objects method, accumulating each score onto the running weight. weigh_objects multiplies the score from _weigh_object by the multiplier from weight_multiplier — the former is the raw score, the latter the weighting factor, and their product is the contribution — so a weigher class must implement both. Finally the list is sorted by weight and returned. To define a custom weigher, subclass BaseHostWeigher and override _weigh_object and weight_multiplier.
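The weighing contract can be sketched the same way. The classes below are simplified stand-ins, not nova's actual code, and the host dicts are made up:

```python
class BaseWeigher(object):
    """Minimal stand-in for nova.weights.BaseWeigher."""
    def weight_multiplier(self):
        # weighting factor; override to boost or invert a weigher
        return 1.0

    def _weigh_object(self, obj, weighing_properties):
        raise NotImplementedError()

    def weigh_objects(self, weighed_obj_list, weighing_properties):
        # accumulate multiplier * score onto each object's running weight
        for weighed_obj in weighed_obj_list:
            weighed_obj['weight'] += (self.weight_multiplier() *
                self._weigh_object(weighed_obj['obj'], weighing_properties))

class RamWeigher(BaseWeigher):
    """Example weigher: score hosts by free RAM."""
    def _weigh_object(self, host, weighing_properties):
        return host['free_ram_mb']

hosts = [{'obj': {'name': 'n1', 'free_ram_mb': 512}, 'weight': 0.0},
         {'obj': {'name': 'n2', 'free_ram_mb': 4096}, 'weight': 0.0}]
RamWeigher().weigh_objects(hosts, {})
ranked = sorted(hosts, key=lambda x: x['weight'], reverse=True)
# n2, with more free RAM, ranks first
```

Running several weighers in sequence simply keeps adding to the same running weight, which is how multiple weighers' scores get summed before sorting.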
nova-scheduler's job is to choose, from many candidate hosts, the most suitable one to run an instance: given a list of instances to boot, nova-scheduler schedules them based on their number, parameters, and so on, picking suitable physical machines (hypervisors — the nodes running nova-compute) to boot them. The H release implements two scheduling algorithms: the filter algorithm and the chance (random) algorithm. The current default is the filter scheduler: first filter hosts on conditions — for example, require more than 2 GB of free memory and reject hosts below that — with filters chainable in series, i.e. layer-by-layer filtering. Then score (weigh) each surviving host based on its state and sort by score, the highest being the best candidate; weighers are chainable too. Note that OpenStack's design principle is extensibility: schedulers, filters, and weighers are all pluggable, so users can define their own and enable them purely through the configuration file, without touching core code. Many filters are provided; the only stock weigher scores by memory, i.e. by RAM usage.
Note: the placement of an instance is not entirely up to the scheduler — users can constrain it via availability zones, aggregates, or --hint options that restrict the host to some set (which is, at bottom, still filtering on user-specified conditions).
Let's see how nova-scheduler works, starting from its entry point, the manager. Partial code:
def run_instance(self, context, request_spec, admin_password,
                 injected_files, requested_networks, is_first_time,
                 filter_properties, legacy_bdm_in_spec=True):
    """Tries to call schedule_run_instance on the driver.
    Sets instance vm_state to ERROR on exceptions
    """
    instance_uuids = request_spec['instance_uuids']
    with compute_utils.EventReporter(context, conductor_api.LocalAPI(),
                                     'schedule', *instance_uuids):
        try:
            return self.driver.schedule_run_instance(context,
                    request_spec, admin_password, injected_files,
                    requested_networks, is_first_time, filter_properties,
                    legacy_bdm_in_spec)
        except exception.NoValidHost as ex:
            # don't re-raise
            self._set_vm_state_and_notify('run_instance',
                                          {'vm_state': vm_states.ERROR,
                                           'task_state': None},
                                          context, ex, request_spec)
        except Exception as ex:
            with excutils.save_and_reraise_exception():
                self._set_vm_state_and_notify('run_instance',
                                              {'vm_state': vm_states.ERROR,
                                               'task_state': None},
                                              context, ex, request_spec)
The method first gets the uuids of the instances to create, then calls the driver's schedule_run_instance directly. The driver is the scheduler: every scheduler must inherit from driver.Scheduler and implement three abstract methods: schedule_run_instance, select_destinations and select_hosts. The driver is set in the configuration file, defaulting to nova.scheduler.filter_scheduler.FilterScheduler, as follows:
scheduler_driver_opt = cfg.StrOpt('scheduler_driver',
        default='nova.scheduler.filter_scheduler.FilterScheduler',
        help='Default driver to use for the scheduler')

CONF = cfg.CONF
CONF.register_opt(scheduler_driver_opt)

class SchedulerManager(manager.Manager):
    """Chooses a host to run instances on."""
    def __init__(self, scheduler_driver=None, *args, **kwargs):
        if not scheduler_driver:
            scheduler_driver = CONF.scheduler_driver
        self.driver = importutils.import_object(scheduler_driver)
        self.compute_rpcapi = compute_rpcapi.ComputeAPI()
        super(SchedulerManager, self).__init__(service_name='scheduler',
                                               *args, **kwargs)
    # ... other code omitted ...
So the entry function calls the driver's (scheduler's) schedule_run_instance method directly. For simplicity, let's take the chance scheduler as the example and see how it works. First, its schedule_run_instance:
def schedule_run_instance(self, context, request_spec,
                          admin_password, injected_files,
                          requested_networks, is_first_time,
                          filter_properties, legacy_bdm_in_spec):
    """Create and run an instance or instances."""
    instance_uuids = request_spec.get('instance_uuids')
    for num, instance_uuid in enumerate(instance_uuids):
        request_spec['instance_properties']['launch_index'] = num
        try:
            host = self._schedule(context, CONF.compute_topic,
                                  request_spec, filter_properties)
            updated_instance = driver.instance_update_db(context,
                                                         instance_uuid)
            self.compute_rpcapi.run_instance(context,
                    instance=updated_instance, host=host,
                    requested_networks=requested_networks,
                    injected_files=injected_files,
                    admin_password=admin_password,
                    is_first_time=is_first_time,
                    request_spec=request_spec,
                    filter_properties=filter_properties,
                    legacy_bdm_in_spec=legacy_bdm_in_spec)
        except Exception as ex:
            # NOTE(vish): we don't reraise the exception here to make sure
            #             that all instances in the request get set to
            #             error properly
            driver.handle_schedule_error(context, ex, instance_uuid,
                                         request_spec)
This method gets the uuid list of all instances to boot and, for each one, calls _schedule, which returns one host; it then updates the database and calls nova-compute's remote API (rpcapi) run_instance to boot the instance on the chosen host — at which point nova-scheduler's job is done. Now the _schedule implementation:
def _schedule(self, context, topic, request_spec, filter_properties):
    """Picks a host that is up at random."""
    elevated = context.elevated()
    # parent Scheduler method: returns all hosts whose nova-compute service is up
    hosts = self.hosts_up(elevated, topic)
    if not hosts:
        msg = _("Is the appropriate service running?")
        raise exception.NoValidHost(reason=msg)
    # filter out blacklisted hosts
    hosts = self._filter_hosts(request_spec, hosts, filter_properties)
    if not hosts:
        msg = _("Could not find another compute")
        raise exception.NoValidHost(reason=msg)
    return random.choice(hosts)  # randomly return one of the hosts
This method is very simple: get all hosts whose service state is up, drop the blacklisted ones, and return one of the rest at random via random.choice.
This article only gave a brief account of the simple chance algorithm; the filter algorithm is more involved and is described in a separate article.
To understand how nova works, start at its entry point: novaclient! Both the nova command and horizon call novaclient.
GitHub address: https://github.com/openstack/python-novaclient
novaclient's job is simple: parse the arguments, build a URL, send the request, and handle the result. For example, to run nova --debug list it must first parse the --debug option (along with environment variables and defaults), then parse the subcommand list and look up its callback function — for list, that is novaclient.v1_1.shell.do_list.
Let's look at how it works in detail. First: what exactly is the nova command?
which nova | xargs -I{} file {}
# 返回/usr/bin/nova: a /usr/bin/python script, ASCII text executable
So the nova command is just a Python script. Let's open it:
#!/usr/bin/python
# PBR Generated from 'console_scripts'
import sys
from novaclient.shell import main

if __name__ == "__main__":
    sys.exit(main())
The nova command calls the main function of novaclient.shell — this is where novaclient takes over, so let's dive in.
First, novaclient.shell's main function:
def main():
    """Entry point."""
    try:
        OpenStackComputeShell().main(map(strutils.safe_decode, sys.argv[1:]))
    except Exception as e:
        logger.debug(e, exc_info=1)
        print("ERROR: %s" % strutils.safe_encode(six.text_type(e)),
              file=sys.stderr)
        sys.exit(1)
It delegates to OpenStackComputeShell's main function, which is the real entry point. Here is its first half:
def main(self, argv):
    # Parse args once to find version and debug settings
    parser = self.get_base_parser()  # adds options such as --os-username, --os-password
    (options, args) = parser.parse_known_args(argv)
    # with --debug, set the logger level to DEBUG and log to stdout
    self.setup_debugging(options.debug)
    # Discover available auth plugins
    novaclient.auth_plugin.discover_auth_systems()
    # build available subcommands based on version
    self.extensions = self._discover_extensions(
        options.os_compute_api_version)
    self._run_extension_hooks('__pre_parse_args__')
    # NOTE(dtroyer): Hackery to handle --endpoint_type due to argparse
    #                thinking usage-list --end is ambiguous; but it
    #                works fine with only --endpoint-type present
    #                Go figure.
    if '--endpoint_type' in argv:
        spot = argv.index('--endpoint_type')
        argv[spot] = '--endpoint-type'
    # build the subcommand parser for this API version
    subcommand_parser = self.get_subcommand_parser(
        options.os_compute_api_version)
    self.parser = subcommand_parser
    # with --help (or no arguments at all), print the help and exit
    if options.help or not argv:
        subcommand_parser.print_help()
        return 0
    args = subcommand_parser.parse_args(argv)  # parse command-line args, e.g. argv=['list']
    #print("args = %s" % args)
    self._run_extension_hooks('__post_parse_args__', args)
    # Short-circuit and deal with help right away.
    # the `nova help xxxx` command
    if args.func == self.do_help:
        self.do_help(args)
        return 0
    # nova bash-completion
    elif args.func == self.do_bash_completion:
        self.do_bash_completion(args)
        return 0
parser is a NovaClientArgumentParser, which subclasses argparse.ArgumentParser — argparse being Python's argument-parsing library.
get_base_parser adds the option arguments — --debug, --timings, --os-username and so on — reading environment variables and setting defaults along the way. Partial code:
# Global arguments
parser.add_argument('-h', '--help',
                    action='store_true',
                    help=argparse.SUPPRESS,
                    )
parser.add_argument('--version',
                    action='version',
                    version=novaclient.__version__)
parser.add_argument('--debug',
                    default=False,
                    action='store_true',
                    help="Print debugging output")
parser.add_argument('--no-cache',
                    default=not utils.bool_from_str(
                        utils.env('OS_NO_CACHE', default='true')),
                    action='store_false',
                    dest='os_cache',
                    help=argparse.SUPPRESS)
parser.add_argument('--no_cache',
                    action='store_false',
                    dest='os_cache',
                    help=argparse.SUPPRESS)
parser.add_argument('--os-cache',
                    default=utils.env('OS_CACHE', default=False),
                    action='store_true',
                    help="Use the auth token cache.")
parser.add_argument('--timings',
                    default=False,
                    action='store_true',
                    help="Print call timing info")
parser.add_argument('--timeout',
                    default=600,
                    metavar='<seconds>',
                    type=positive_non_zero_float,
                    help="Set HTTP call timeout (in seconds)")
Anyone who has used the argparse library will be right at home here.
Back in main: next comes the debug setup — with the --debug option, the logger level is set to DEBUG and output goes to standard output.
(options, args) = parser.parse_known_args(argv) returns the parse result: options holds all the option arguments, args the positional ones. For nova --debug list, options.debug is True and args is ['list'].
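parse_known_args behaves like this (a standalone argparse example, independent of novaclient):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--debug', default=False, action='store_true')

# parse_known_args tolerates arguments it does not recognize,
# returning them separately instead of raising an error --
# which is why it can be used for a first "options only" pass
options, args = parser.parse_known_args(['--debug', 'list'])
```

This is exactly why the shell parses twice: the first pass only needs --debug and the API version, before the version-specific subcommand parser even exists.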
The next function, get_subcommand_parser, is a core one: it handles subcommands such as list, flavor-list, boot. Code:
def get_subcommand_parser(self, version):
    parser = self.get_base_parser()
    self.subcommands = {}
    subparsers = parser.add_subparsers(metavar='<subcommand>')
    try:
        actions_module = {
            '1.1': shell_v1_1,
            '2': shell_v1_1,
            '3': shell_v3,
        }[version]
    except KeyError:
        actions_module = shell_v1_1  # default to v1.1
    self._find_actions(subparsers, actions_module)
    self._find_actions(subparsers, self)
    for extension in self.extensions:
        self._find_actions(subparsers, extension.module)
    self._add_bash_completion_subparser(subparsers)
    return parser
This method looks up the available actions by version (default 1.1). Assume the shell_v1_1 module is used (imported as from novaclient.v1_1 import shell as shell_v1_1); it then calls _find_actions. Note that a module is passed in — in Python everything is an object, modules included, so think of it as passing a class object, like Java's XXXClass.class. Code:
def _find_actions(self, subparsers, actions_module):
    # actions_module = shell_v1_1
    for attr in (a for a in dir(actions_module) if a.startswith('do_')):
        # e.g. attr = do_flavor_list
        # I prefer to be hypen-separated instead of underscores.
        command = attr[3:].replace('_', '-')  # do_flavor_list -> flavor-list
        callback = getattr(actions_module, attr)
        desc = callback.__doc__ or ''
        action_help = desc.strip()
        arguments = getattr(callback, 'arguments', [])
        subparser = subparsers.add_parser(command,
                                          help=action_help,
                                          description=desc,
                                          add_help=False,
                                          formatter_class=OpenStackHelpFormatter
                                          )
        subparser.add_argument('-h', '--help',
                               action='help',
                               help=argparse.SUPPRESS,
                               )
        self.subcommands[command] = subparser
        for (args, kwargs) in arguments:
            subparser.add_argument(*args, **kwargs)
        subparser.set_defaults(func=callback)
So this method uses reflection to find all methods starting with do_: a do_XXX_XXX method yields the command name XXX-XXX, and do_XXX_XXX itself is the callback. Assigning a function to the variable callback is a classic functional-programming move; finally the callback is installed via set_defaults.
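The reflection trick can be reproduced in a few lines. Shell and its do_* methods below are toy stand-ins for the shell_v1_1 module, not novaclient code:

```python
import argparse

class Shell(object):
    def do_list(self, args):
        """List servers."""
        return 'list called'

    def do_flavor_list(self, args):
        """List flavors."""
        return 'flavor-list called'

def find_actions(subparsers, actions_module, subcommands):
    # reflect over do_* attributes: do_flavor_list becomes the
    # subcommand "flavor-list", with the method itself as the callback
    for attr in (a for a in dir(actions_module) if a.startswith('do_')):
        command = attr[3:].replace('_', '-')
        callback = getattr(actions_module, attr)
        subparser = subparsers.add_parser(
            command, help=(callback.__doc__ or '').strip())
        subparser.set_defaults(func=callback)
        subcommands[command] = subparser

parser = argparse.ArgumentParser()
subparsers = parser.add_subparsers(metavar='<subcommand>')
subcommands = {}
find_actions(subparsers, Shell(), subcommands)

ns = parser.parse_args(['flavor-list'])  # ns.func is the bound do_flavor_list
```

Adding a new subcommand thus requires nothing but defining another do_* method — no registration table to maintain.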
So nova list actually calls novaclient.v1_1.shell.do_list(), and nova flavor-list calls novaclient.v1_1.shell.do_flavor_list(). Let's follow nova --debug flavor-list further.
Looking at the novaclient.v1_1.shell source, there are many do_XXX methods, but they do little work themselves — they delegate to cs, whatever that is for now. Here is do_flavor_list:
def do_flavor_list(cs, args):
    """Print a list of available 'flavors' (sizes of servers)."""
    if args.all:
        flavors = cs.flavors.list(is_public=None)
    else:
        flavors = cs.flavors.list()
    _print_flavor_list(flavors, args.extra_specs)
We still don't know what cs is, so back to main. The middle of the function is all parameter checking, which we skip; jump straight to the end:
def main(self, argv):
    # ... option and subcommand parsing as shown above ...

    self.cs = client.Client(options.os_compute_api_version, os_username,
            os_password, os_tenant_name, tenant_id=os_tenant_id,
            auth_url=os_auth_url, insecure=insecure,
            region_name=os_region_name, endpoint_type=endpoint_type,
            extensions=self.extensions, service_type=service_type,
            service_name=service_name, auth_system=os_auth_system,
            auth_plugin=auth_plugin,
            volume_service_name=volume_service_name,
            timings=args.timings, bypass_url=bypass_url,
            os_cache=os_cache, http_log_debug=options.debug,
            cacert=cacert, timeout=timeout)

    # ... much code omitted ...

    args.func(self.cs, args)  # for our example, func is do_flavor_list
    if args.timings:  # with --timings, print the request timing info
        self._dump_timings(self.cs.get_timings())
So cs is whatever client.Client returns. Its code, in client.py:
def get_client_class(version):
    version_map = {
        '1.1': 'novaclient.v1_1.client.Client',
        '2': 'novaclient.v1_1.client.Client',
        '3': 'novaclient.v3.client.Client',
    }
    try:
        client_path = version_map[str(version)]
    except (KeyError, ValueError):
        msg = "Invalid client version '%s'. must be one of: %s" % (
              (version, ', '.join(version_map.keys())))
        raise exceptions.UnsupportedVersion(msg)
    return utils.import_class(client_path)

def Client(version, *args, **kwargs):
    client_class = get_client_class(version)
    return client_class(*args, **kwargs)
Clearly cs is the Client class chosen by version — here, novaclient.v1_1.client.Client. This module acts as the registry of feature modules: for example the flavors feature lives in flavors.py, and for it to take effect it must be registered, i.e. Client sets self.flavors = flavors.FlavorManager(self):
self.projectid = project_id
self.tenant_id = tenant_id
self.flavors = flavors.FlavorManager(self)
self.flavor_access = flavor_access.FlavorAccessManager(self)
self.images = images.ImageManager(self)
self.limits = limits.LimitsManager(self)
self.servers = servers.ServerManager(self)
So in do_flavor_list, cs.flavors.list() calls flavors.FlavorManager().list. Here you can see OpenStack's design principle of flexible extensibility at work: adding a new feature requires hardly any code changes — you only have to register it on the Client.
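The registration pattern boils down to the following sketch — HypotheticalClient and its fake transport are invented for illustration, not novaclient code:

```python
class FlavorManager(object):
    """A feature module: builds URLs and delegates transport to the client."""
    def __init__(self, api):
        self.api = api

    def list(self):
        return self.api.get('/flavors/detail')

class HypotheticalClient(object):
    """Toy client showing the registration pattern: enabling a feature
    module is a single assignment in __init__."""
    def __init__(self):
        self.flavors = FlavorManager(self)
        # a new feature module would be registered the same way:
        # self.volumes = VolumeManager(self)

    def get(self, url):
        # stand-in for the real HTTP transport
        return 'GET ' + url

cs = HypotheticalClient()
```

Each manager knows only its URL layout and leans on the shared client for authentication and transport, so feature modules stay independent of one another.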
Now the list method in flavors.py:
def list(self, detailed=True, is_public=True):
    """
    Get a list of all flavors.

    :rtype: list of :class:`Flavor`.
    """
    qparams = {}
    # is_public is ternary - None means give all flavors.
    # By default Nova assumes True and gives admins public flavors
    # and flavors from their own projects only.
    if not is_public:
        qparams['is_public'] = is_public
    query_string = "?%s" % urlutils.urlencode(qparams) if qparams else ""
    detail = ""
    if detailed:
        detail = "/detail"
    return self._list("/flavors%s%s" % (detail, query_string), "flavors")
Plainly, this assembles the list command into a URL request and calls _list. FlavorManager extends base.ManagerWithFind, which extends Manager, where _list is defined:
def _list(self, url, response_key, obj_class=None, body=None):
    if body:
        _resp, body = self.api.client.post(url, body=body)
    else:
        _resp, body = self.api.client.get(url)

    if obj_class is None:
        obj_class = self.resource_class

    data = body[response_key]
    # NOTE(ja): keystone returns values as list as {'values': [ ... ]}
    #           unlike other services which just return the list...
    if isinstance(data, dict):
        try:
            data = data['values']
        except KeyError:
            pass

    with self.completion_cache('human_id', obj_class, mode="w"):
        with self.completion_cache('uuid', obj_class, mode="w"):
            return [obj_class(self, res, loaded=True)
                    for res in data if res]
From the source we can see that this mainly sends the URL request, i.e. self.api.client.post(url, body=body) or self.api.client.get(url): POST is used when there is a body, otherwise GET. It then processes the returned data. self.api here is actually novaclient.v1_1.client.Client; it was called cs earlier and is called api here.
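Putting list() and _list together, the final request URL for `nova flavor-list` is assembled purely by string formatting. A standalone sketch of that logic, using the stdlib urlencode (novaclient's urlutils.urlencode is a thin wrapper over it; the function name here is invented for illustration):

```python
from urllib.parse import urlencode  # what urlutils.urlencode wraps

def flavor_list_url(detailed=True, is_public=True):
    # reproduce the query-string assembly from FlavorManager.list
    qparams = {}
    if not is_public:          # only the non-default value goes on the wire
        qparams['is_public'] = is_public
    query_string = "?%s" % urlencode(qparams) if qparams else ""
    detail = "/detail" if detailed else ""
    return "/flavors%s%s" % (detail, query_string)

print(flavor_list_url())                 # /flavors/detail
print(flavor_list_url(is_public=False))  # /flavors/detail?is_public=False
print(flavor_list_url(detailed=False))   # /flavors
```

The resulting path is then appended to management_url and handed to GET, which is where _cs_request below takes over.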
Returning to novaclient.v1_1.client.Client, we find that besides registering a series of feature modules it sets one special attribute:
self.client = client.HTTPClient(username,
                                password,
                                projectid=project_id,
                                tenant_id=tenant_id,
                                auth_url=auth_url,
                                insecure=insecure,
                                timeout=timeout,
                                auth_system=auth_system,
                                auth_plugin=auth_plugin,
                                proxy_token=proxy_token,
                                proxy_tenant_id=proxy_tenant_id,
                                region_name=region_name,
                                endpoint_type=endpoint_type,
                                service_type=service_type,
                                service_name=service_name,
                                volume_service_name=volume_service_name,
                                timings=timings,
                                bypass_url=bypass_url,
                                os_cache=self.os_cache,
                                http_log_debug=http_log_debug,
                                cacert=cacert)
This self.client is a client.HTTPClient, the class that actually sends the URL requests. Part of its code:
def _cs_request(self, url, method, **kwargs):
    if not self.management_url:
        self.authenticate()

    # Perform the request once. If we get a 401 back then it
    # might be because the auth token expired, so try to
    # re-authenticate and try again. If it still fails, bail.
    try:
        kwargs.setdefault('headers', {})['X-Auth-Token'] = self.auth_token
        if self.projectid:
            kwargs['headers']['X-Auth-Project-Id'] = self.projectid

        resp, body = self._time_request(self.management_url + url, method,
                                        **kwargs)
        return resp, body
    except exceptions.Unauthorized as e:
        try:
            # first discard auth token, to avoid the possibly expired
            # token being re-used in the re-authentication attempt
            self.unauthenticate()
            self.authenticate()
            kwargs['headers']['X-Auth-Token'] = self.auth_token
            resp, body = self._time_request(self.management_url + url,
                                            method, **kwargs)
            return resp, body
        except exceptions.Unauthorized:
            raise e

def get(self, url, **kwargs):
    return self._cs_request(url, 'GET', **kwargs)
Finally http.request is called to send the request. This uses the Python library Requests: HTTP for Humans, which is nicer than httplib2; see http://docs.python-requests.org/en/latest/. Receiving the request is the job of nova-api, which we will not go into here.
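The interesting part of _cs_request is its control flow: attach the token, and on a 401 re-authenticate once and retry. Stripped of all HTTP details, that flow looks like the following minimal sketch (MiniHTTPClient, FakeTransport and the token strings are all invented for the demo):

```python
class Unauthorized(Exception):
    """Stands in for exceptions.Unauthorized (HTTP 401)."""

class FakeTransport:
    """Accepts only the current server-side token; rejects expired ones."""
    def __init__(self):
        self.valid_token = "token-2"
    def request(self, url, method, token):
        if token != self.valid_token:
            raise Unauthorized()
        return 200, {"ok": True}

class MiniHTTPClient:
    def __init__(self, transport):
        self.transport = transport
        self.auth_token = "token-1"   # pretend this token has expired
        self.auths = 0
    def authenticate(self):
        self.auths += 1
        self.auth_token = "token-%d" % (self.auths + 1)
    def _cs_request(self, url, method):
        # first attempt with the current (possibly expired) token
        try:
            return self.transport.request(url, method, self.auth_token)
        except Unauthorized:
            # discard the old token, re-authenticate once, then retry;
            # if the retry also gets a 401 the exception propagates (the
            # "bail" case in the real code)
            self.authenticate()
            return self.transport.request(url, method, self.auth_token)

cs = MiniHTTPClient(FakeTransport())
print(cs._cs_request("/flavors", "GET"))  # (200, {'ok': True})
```

Exactly one re-authentication happens here: the expired token-1 triggers a 401, authenticate() issues token-2, and the retry succeeds.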
Next, let us add a trivial test feature. First, under novaclient/v1_1, touch test.py and add the following code:
"""
Test interface.
"""
from novaclient import base
class Test(base.Resource):
def test(self):
print("This is a test")
class TestManager(base.Manager):
def test(self):
print("This is a test")
Then register it in the client: edit novaclient/v1_1/client.py and add self.test = test.TestManager(self) (the test module also needs to be imported there).
Then add an entry function in shell.py to register the new command:
def do_test(cs, _args):
    """do test."""
    cs.test.test()
Run nova test and nova help test to see the effect.
OpenStack supports several approaches to live migration. Below we configure block migration, i.e. live migration by copying: it is simple to configure, but the instance files have to be copied from the source compute node to the target compute node, which takes time. The alternative uses shared storage: put every node's /var/lib/nova/instances (the instance data) in one place, for example on the controller node, and have all compute nodes mount that directory over NFS. Migration then involves no file copying, but concentrating all compute nodes' instance files in one place is inconvenient to manage; see the official documentation for that setup.
Here is the block-migration configuration:
First, add live_migration_flag=VIR_MIGRATE_UNDEFINE_SOURCE,VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_LIVE,VIR_MIGRATE_UNSAFE to nova.conf on every compute node to enable live migration.
Then configure passwordless virsh connections by modifying /etc/libvirt/libvirtd.conf.
Download the original XP ISO image: http://pan.baidu.com/s/1c0owVc8
Download the virtio drivers: http://alt.fedoraproject.org/pub/alt/virtio-win/archives/virtio-win-0.1-59/virtio-win-0.1-59.iso
Create a qcow2 virtual disk:
kvm-img create -f qcow2 xp.qcow2 10G
Install XP onto the virtual disk we just created. Some tutorials say to attach the virtio-win-xx.vfd floppy image and load the virtio drivers at this point; XP does not actually need them yet, we will install them later:
kvm -m 1024 -cdrom xp.iso -drive file=xp.qcow2 -boot d
Once the system is installed, boot into it and install the virtio drivers:
kvm -hda xp.qcow2 \
-drive file=xp.qcow2,if=virtio \
-drive file=virtio-win-0.1-59.iso,media=cdrom,index=1 \
-net nic,model=virtio \
-net user \
-boot d \
-vga std \
-m 1024
Boot into XP, open My Computer -> Manage -> Device Manager, and update the SCSI and network-card drivers. The SCSI driver is mandatory; without it the guest blue-screens on boot.
With the drivers installed, the image can be uploaded to OpenStack:
glance image-create --name xp --container-format=ovf --disk-format=qcow2 --file xp.qcow2 --progress
When creating an instance, the root disk must be at least as large as the virtual disk we created; the ephemeral disk appears as a new, unformatted virtual hard disk; no swap is needed. Once the instance is created, log into the system.
At this point the C: drive is the same size as the virtual disk. If the allocated root disk is larger, say our virtual disk is 10G but the instance root disk is 20G, use a partition-resizing tool to grow C: (http://pan.baidu.com/s/1eQh7q9c). If there is an ephemeral disk, initialize it via My Computer -> Manage -> Disk Management and then format it.
A newly attached volume likewise needs to be initialized and formatted if it has not been.
For remote login, also enable Remote Desktop; I usually turn off the firewall as well.
Here is an image I made, ready to use: http://pan.baidu.com/s/1pJEyVGZ