Hardy edited this page Aug 19, 2016 · 1 revision

Welcome to the RDS wiki!

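OpenStack test environment

Network configuration:

* Service VLAN range: 700-740
* VLAN range used for integration testing with GIC: 1010-1100
* Switch ports used by OpenStack: ports 39-47, carrying VLANs 700-740 and 1010-1100

Other VLANs:

* VLAN 720: VXLAN tunnel (172.16.5.0/24), EVPN
* VLAN 721: management network (172.16.6.0/24), mgt, gateway 172.16.6.254
* VLAN 723: floating IP (172.16.8.0/24, gateway 172.16.8.1), br-ex
* VLAN 50: out-of-band addresses used for trove-agent connections; subnet 172.5.13.0/24, address in use 172.5.13.1, gateway 172.5.13.254
* VLAN 740: service network, connected to VLAN 50 on 65; 172.16.15.0/24, gateway 172.16.15.254
* VLAN 724: test service, 172.16.17.0/24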

controller:

* 172.5.13.1: interface enp1s0f0, switch port 1 (external network)
* xx.xx.xx.xx: interface enp1s0f1, switch port 2
* 172.16.6.1: interface enp1s0f1.721, VLAN 721 (management network)
* 172.16.5.1: interface enp1s0f1.720, VLAN 720 (tunnel address)

compute01:

* 172.5.13.2: interface em1, switch port 3 (management network)
* xx.xx.xx.xx: interface em2, switch port 4
* 172.16.6.2: interface em2.721, VLAN 721 (management network)
* 172.16.5.2: interface em2.720, VLAN 720 (tunnel address)

compute02:

* 172.5.13.3: interface em1, switch port 5 (management network)
* xx.xx.xx.xx: interface em2, switch port 6
* 172.16.6.3: interface em2.721, VLAN 721 (management network)
* 172.16.5.3: interface em2.720, VLAN 720 (tunnel address)

VXLAN encapsulation adds 50 bytes of overhead, so adjust the MTU on the device interfaces accordingly.
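
The MTU adjustment is simple arithmetic. The sketch below only illustrates the 50-byte figure quoted above; the function names are made up for this example, and the 1500-byte values are just common defaults, not measurements from this environment.

```python
VXLAN_OVERHEAD = 50  # bytes added by VXLAN encapsulation, as stated above


def tenant_mtu(underlay_mtu, overhead=VXLAN_OVERHEAD):
    """Largest MTU an instance can use when the underlay MTU is fixed."""
    return underlay_mtu - overhead


def required_underlay_mtu(instance_mtu, overhead=VXLAN_OVERHEAD):
    """Underlay MTU needed so instances can keep the given MTU."""
    return instance_mtu + overhead


print(tenant_mtu(1500))             # instance MTU on a 1500-byte underlay
print(required_underlay_mtu(1500))  # underlay MTU for 1500-byte instances
```

Either lower the MTU inside the instances or raise it on the switch ports and tunnel interfaces; the two values only have to differ by the overhead.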

Script for building images with diskimage-builder:

    #!/usr/bin/sh
    set -xe
    mkdir -p /trove; cd /trove
    git clone https://github.com/openstack/tripleo-image-elements.git
    git clone https://github.com/vkmc/trove-image-elements.git
    pip install git+https://github.com/openstack/diskimage-builder.git

    diskimage_elements_path=/usr/share/diskimage-builder/elements
    if [ -e "$diskimage_elements_path" -a -d "$diskimage_elements_path" ]; then
        echo "$diskimage_elements_path already exists"
    else
        git clone https://github.com/openstack/diskimage-builder.git
        mkdir -p /usr/share/diskimage-builder
        cp -rfv diskimage-builder/{elements,lib} /usr/share/diskimage-builder
    fi
    cd

Set the environment variables:

    # Build the image on CentOS, with the Trove packages included
    export DISTRO=centos
    export DIB_RELEASE=GenericCloud
    export DIB_EXTLINUX=0
    export DIB_DEBUG_TRACE=1

    #export DIB_DEV_USER_USERNAME=trove
    #export DIB_DEV_USER_PASSWORD=123456
    #export DIB_DEV_USER_SHELL=/bin/bash
    #export DIB_DEV_USER_PWDLESS_SUDO=true
    #export DIB_DEV_USER_AUTHORIZED_KEYS=/home/trove/.ssh/authorized_keys

    export DIB_CLOUD_INIT_DATASOURCES="OpenStack,EC2"
    export DIB_REPO_PATH=/usr/share/diskimage-builder
    export ELEMENTS_PATH=/trove/tripleo-image-elements/elements:/trove/trove-image-elements/elements:/usr/share/diskimage-builder/elements

    export DIB_TROVE_RABBITMQ_HOSTS=172.5.13.1:5672
    export DIB_TROVE_RABBIT_USERID=guest
    export DIB_TROVE_RABBIT_PASSWORD=guest
    export DIB_TROVE_RABBIT_USE_SSL=false
    export DIB_TROVE_AUTH_URL=https://172.5.13.1:35357/v2.0
    export DIB_TROVE_NOVA_PROXY_ADMIN_USER=nova
    export DIB_TROVE_NOVA_PROXY_ADMIN_PASS=123456
    export DIB_TROVE_NOVA_PROXY_ADMIN_TENANT_NAME=service
    export DIB_TROVE_SWIFT_URL=https://172.5.13.1:8080/v1/AUTH_
    export DIB_TROVE_SWIFT_SERVICE_TYPE=object-store
    export DIB_TROVE_BACKUP_SWIFT_CONTAINER=database_backups

    export DATASTORE="mysql"
    export DATASTORE_VERSION="5.6"
    export PACKAGES="mysql-community-server"

Build the image:

    disk-image-create -a amd64 -o ${DISTRO}-${DATASTORE}-${DATASTORE_VERSION}-guest-image -x --qemu-img-options compat=0.10 ${DISTRO}-${DATASTORE}-guest-image
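
To make it easier to see which output file and which element the command above ends up with, here is a small sketch that expands the same variables in Python. It only builds the command string and does not run diskimage-builder; the `env` dictionary simply repeats the values exported above.

```python
import shlex
from string import Template

# Values copied from the export lines above.
env = {"DISTRO": "centos", "DATASTORE": "mysql", "DATASTORE_VERSION": "5.6"}

command = Template(
    "disk-image-create -a amd64 "
    "-o ${DISTRO}-${DATASTORE}-${DATASTORE_VERSION}-guest-image "
    "-x --qemu-img-options compat=0.10 "
    "${DISTRO}-${DATASTORE}-guest-image"
).substitute(env)

argv = shlex.split(command)
output_name = argv[argv.index("-o") + 1]  # value passed to -o
element = argv[-1]                        # trailing positional argument

print(command)
print("output:", output_name)
print("element:", element)
```

The element name has to exist in one of the directories listed in ELEMENTS_PATH, which is why the trove-image-elements checkout is on that path.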

Build an Ubuntu-based image (base image only, without Trove):

    export DISTRO=ubuntu
    export DIB_RELEASE=trusty
    disk-image-create -a amd64 -o ${DISTRO}-guest-image -x --qemu-img-options compat=0.10 grub2 vm ubuntu

Force-unmount the /tmp/xxxxaaa directory:

    umount -fl /tmp/xxxxaaa/

Commands for registering the image with Trove:

    trove-manage datastore_update ${DATASTORE} ""
    trove-manage datastore_version_update ${DATASTORE} ${DATASTORE_VERSION} ${DATASTORE} ${IMAGE_ID} ${PACKAGES} 1
    trove-manage datastore_update ${DATASTORE} ${DATASTORE_VERSION}

Load the configurable parameters (options):

    trove-manage db_load_datastore_config_parameters "mysql" "5.6" validation-rules.json

See the attachment for the validation-rules.json file.

Upload an image:

    glance image-create --name "cirros-in-fs" --file /tmp/images/cirros-0.3.4-x86_64-disk.img --disk-format qcow2 --container-format bare --visibility public --progress

Mount an image:

    guestmount -a centos-mysql-5.6-guest-image.qcow2 -m /dev/sda1 --rw /mnt/
    # Enter the image and make the changes
    chroot /mnt
    # When finished, press Ctrl+D to leave the chroot
    # Unmount the image
    umount /mnt

#Diskimage build FAQ

map_file: /usr/share/pkg-map/base, which is the base/pkg-map file. Its contents are:

    {
      "family": {
        "redhat": {
          "iscsi_package": "iscsi-initiator-utils"
        },
        "suse": {
          "dkms_package": ""
        },
        "gentoo": {
          "dkms_package": "",
          "grub-pc": "grub",
          "extlinux": "syslinux"
        }
      },
      "default": {
        "ccache_package": "ccache",
        "dkms_package": "dkms",
        "iscsi_package": "open-iscsi"
      }
    }

If the build reports that a package is missing, add it to this file. For example, if grub is missing, add it under "default". If you are unsure of the syntax, use the grub2/pkg-map file as a reference:

    "default": {
      "ccache_package": "ccache",
      "dkms_package": "dkms",
      "iscsi_package": "open-iscsi",
      "grub-pc": "grub-pc-bin"
    }
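
Editing the JSON by hand is where stray curly quotes or a missing comma tend to creep in. The sketch below shows one way to add the entry programmatically; `add_default_package` is a helper name invented for this example, and the sample text is a trimmed copy of the file shown above.

```python
import json


def add_default_package(pkg_map_text, key, package):
    """Return pkg-map JSON text with key -> package added under "default"."""
    pkg_map = json.loads(pkg_map_text)
    pkg_map.setdefault("default", {})[key] = package
    return json.dumps(pkg_map, indent=2, sort_keys=True)


original = """
{
  "family": {"redhat": {"iscsi_package": "iscsi-initiator-utils"}},
  "default": {
    "ccache_package": "ccache",
    "dkms_package": "dkms",
    "iscsi_package": "open-iscsi"
  }
}
"""

updated = add_default_package(original, "grub-pc", "grub-pc-bin")
print(updated)
```

Because the text goes through `json.loads`, a malformed file fails immediately instead of during the image build.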

Convert to qcow2:

    qemu-img convert -c -f raw /tmp/image.9yAanZwY/image.raw -O qcow2 -o compat=0.10 ubuntu.qcow2-new

Start the Trove services:

    systemctl restart openstack-trove-api openstack-trove-taskmanager openstack-trove-conductor
    systemctl status openstack-trove-api openstack-trove-taskmanager openstack-trove-conductor

#Trove FAQ:

The Trove guest agent start-up logic is here:

    /trove/guestagent/datastore/mysql/service.py:598: def install_if_needed(self, packages)

The prepare method in manager.py contains all the logic for creating an RDS instance.

MySQL's my.cnf is supplied by Trove. The VM flavor describes the instance's resources (for example CPU, RAM and disk). Trove finds the matching configuration template in the code and renders the configuration from it. The template file is /usr/lib/python2.7/site-packages/trove/templates/mysql/config.template, with the following contents:

    [client]
    port = 3306

    [mysqld_safe]
    nice = 0

    [mysqld]
    user = mysql
    port = 3306
    basedir = /usr
    datadir = /var/lib/mysql
    ####tmpdir = /tmp
    tmpdir = /var/tmp
    pid_file = /var/run/mysqld/mysqld.pid
    skip-external-locking = 1
    key_buffer_size = {{ (50 * flavor['ram']/512)|int }}M
    max_allowed_packet = {{ (1024 * flavor['ram']/512)|int }}K
    thread_stack = 192K
    thread_cache_size = {{ (4 * flavor['ram']/512)|int }}
    myisam-recover = BACKUP
    query_cache_type = 1
    query_cache_limit = 1M
    query_cache_size = {{ (8 * flavor['ram']/512)|int }}M
    innodb_data_file_path = ibdata1:10M:autoextend
    innodb_buffer_pool_size = {{ (150 * flavor['ram']/512)|int }}M
    innodb_file_per_table = 1
    innodb_log_files_in_group = 2
    innodb_log_file_size=50M
    innodb_log_buffer_size=25M
    connect_timeout = 15
    wait_timeout = 120
    join_buffer_size = 1M
    read_buffer_size = 512K
    read_rnd_buffer_size = 512K
    sort_buffer_size = 1M
    tmp_table_size = {{ (16 * flavor['ram']/512)|int }}M
    max_heap_table_size = {{ (16 * flavor['ram']/512)|int }}M
    table_open_cache = {{ (256 * flavor['ram']/512)|int }}
    table_definition_cache = {{ (256 * flavor['ram']/512)|int }}
    open_files_limit = {{ (512 * flavor['ram']/512)|int }}
    max_user_connections = {{ (100 * flavor['ram']/512)|int }}
    max_connections = {{ (100 * flavor['ram']/512)|int }}
    default_storage_engine = innodb
    local-infile = 0
    server_id = {{server_id}}

    log_error = /var/lib/mysql/mysql-error.log
    slow_query_log = 1
    slow_query_log_file = /var/lib/mysql/slow-query.log
    long_query_time = 2

    [mysqldump]
    quick = 1
    quote-names = 1
    max_allowed_packet = 16M

    [isamchk]
    key_buffer = 16M

    !includedir /etc/mysql/conf.d/

Note: /home/max/Project/trove/trove/common/template.py contains the logic that rewrites my.cnf.
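
Every sized setting in the template follows the same pattern: a factor multiplied by the flavor's RAM (in MB) and divided by 512, truncated to an integer. The sketch below reproduces that arithmetic in plain Python for a few of the settings so the values for a given flavor can be checked by hand. It does not use Trove or Jinja2, and `scaled` and `sizing_for_flavor` are names made up for this example.

```python
def scaled(factor, ram_mb):
    """Mirror the template expression (factor * flavor['ram'] / 512)|int."""
    return int(factor * ram_mb / 512)


def sizing_for_flavor(ram_mb):
    """A few of the template's RAM-dependent settings for a given flavor."""
    return {
        "key_buffer_size": "%dM" % scaled(50, ram_mb),
        "max_allowed_packet": "%dK" % scaled(1024, ram_mb),
        "innodb_buffer_pool_size": "%dM" % scaled(150, ram_mb),
        "max_connections": scaled(100, ram_mb),
    }


for name, value in sorted(sizing_for_flavor(2048).items()):
    print(name, "=", value)
```

A 512 MB flavor gets the bare factors (50M key buffer, 150M buffer pool, 100 connections), and everything grows linearly from there.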

Trove instance (service) statuses:

    RUNNING = ServiceStatus(0x01, 'running', 'ACTIVE')
    BLOCKED = ServiceStatus(0x02, 'blocked', 'BLOCKED')
    PAUSED = ServiceStatus(0x03, 'paused', 'SHUTDOWN')
    SHUTDOWN = ServiceStatus(0x04, 'shutdown', 'SHUTDOWN')
    CRASHED = ServiceStatus(0x06, 'crashed', 'SHUTDOWN')
    FAILED = ServiceStatus(0x08, 'failed to spawn', 'FAILED')
    BUILDING = ServiceStatus(0x09, 'building', 'BUILD')
    PROMOTING = ServiceStatus(0x10, 'promoting replica', 'PROMOTE')
    EJECTING = ServiceStatus(0x11, 'ejecting replica source', 'EJECT')
    UNKNOWN = ServiceStatus(0x16, 'unknown', 'ERROR')
    NEW = ServiceStatus(0x17, 'new', 'NEW')
    DELETED = ServiceStatus(0x05, 'deleted', 'DELETED')
    FAILED_TIMEOUT_GUESTAGENT = ServiceStatus(0x18, 'guestagent error', 'ERROR')
    BUILD_PENDING = ServiceStatus(0x19, 'build pending', 'BUILD')

Trove instance task statuses:

    NONE = InstanceTask(0x01, 'NONE', 'No tasks for the instance.')
    DELETING = InstanceTask(0x02, 'DELETING', 'Deleting the instance.')
    REBOOTING = InstanceTask(0x03, 'REBOOTING', 'Rebooting the instance.')
    RESIZING = InstanceTask(0x04, 'RESIZING', 'Resizing the instance.')
    BUILDING = InstanceTask(0x05, 'BUILDING', 'The instance is building.')
    MIGRATING = InstanceTask(0x06, 'MIGRATING', 'Migrating the instance.')
    RESTART_REQUIRED = InstanceTask(0x07, 'RESTART_REQUIRED', 'Instance requires a restart.')
    PROMOTING = InstanceTask(0x08, 'PROMOTING', 'Promoting the instance to replica source.')
    EJECTING = InstanceTask(0x09, 'EJECTING', 'Ejecting the replica source.')
    LOGGING = InstanceTask(0x0a, 'LOGGING', 'Transferring guest logs.')
    BUILDING_ERROR_DNS = InstanceTask(0x50, 'BUILDING', 'Build error: DNS.', is_error=True)
    BUILDING_ERROR_SERVER = InstanceTask(0x51, 'BUILDING', 'Build error: Server.', is_error=True)
    BUILDING_ERROR_VOLUME = InstanceTask(0x52, 'BUILDING', 'Build error: Volume.', is_error=True)
    BUILDING_ERROR_TIMEOUT_GA = InstanceTask(0x54, 'ERROR', 'Build error: guestagent timeout.', is_error=True)
    BUILDING_ERROR_SEC_GROUP = InstanceTask(0x53, 'BUILDING', 'Build error: Secgroup or rule.', is_error=True)
    BUILDING_ERROR_REPLICA = InstanceTask(0x54, 'BUILDING', 'Build error: Replica.', is_error=True)
    PROMOTION_ERROR = InstanceTask(0x55, 'PROMOTING', 'Replica Promotion Error.', is_error=True)
    EJECTION_ERROR = InstanceTask(0x56, 'EJECTING', 'Replica Source Ejection Error.', is_error=True)
    GROWING_ERROR = InstanceTask(0x57, 'GROWING', 'Growing Cluster Error.', is_error=True)
    SHRINKING_ERROR = InstanceTask(0x58, 'SHRINKING', 'Shrinking Cluster Error.', is_error=True)

Handling errored Trove instances that cannot be deleted

    [root@controller ~]# mysql -uroot -p123456
    Welcome to the MariaDB monitor.  Commands end with ; or \g.
    Your MariaDB connection id is 1945
    Server version: 5.5.44-MariaDB MariaDB Server

    Copyright (c) 2000, 2015, Oracle, MariaDB Corporation Ab and others.

    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

    MariaDB [(none)]> use trove;
    Reading table information for completion of table and column names
    You can turn off this feature to get a quicker startup with -A

    Database changed
    MariaDB [trove]> show tables;
    +--------------------------------------+
    | Tables_in_trove                      |
    +--------------------------------------+
    | agent_heartbeats                     |
    | backups                              |
    | capabilities                         |
    | capability_overrides                 |
    | clusters                             |
    | conductor_lastseen                   |
    | configuration_parameters             |
    | configurations                       |
    | datastore_configuration_parameters   |
    | datastore_versions                   |
    | datastores                           |
    | dns_records                          |
    | instances                            |
    | migrate_version                      |
    | quota_usages                         |
    | quotas                               |
    | reservations                         |
    | root_enabled_history                 |
    | security_group_instance_associations |
    | security_group_rules                 |
    | security_groups                      |
    | service_images                       |
    | service_statuses                     |
    | usage_events                         |
    +--------------------------------------+
    24 rows in set (0.00 sec)

    MariaDB [trove]> update service_statuses set status_id=24, status_description='guestagent error' where instance_id='586f8c74f38';
    Query OK, 1 row affected (0.01 sec)
    Rows matched: 1  Changed: 1  Warnings: 0

    MariaDB [trove]> update instances set task_id=84 where id='6b42605d-aa26-4db7-ab9e-08e4c5740f31';
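
The ids in these UPDATE statements are decimal, while the status listings give the codes in hexadecimal: 24 is 0x18 and 84 is 0x54. The sketch below does that conversion against small lookup tables copied from the listings on this page. It is not Trove code, and `describe` is a helper name invented here. Note that the task listing assigns 0x54 to two names.

```python
# Codes copied from the status listings on this page (excerpt only).
SERVICE_STATUS_CODES = {
    0x01: "RUNNING",
    0x08: "FAILED",
    0x18: "FAILED_TIMEOUT_GUESTAGENT",
}
INSTANCE_TASK_CODES = {
    0x01: "NONE",
    0x54: "BUILDING_ERROR_TIMEOUT_GA / BUILDING_ERROR_REPLICA",
}


def describe(value, table):
    """Show a decimal database id next to its hex code and name."""
    name = table.get(value, "not in this excerpt")
    return "%d = 0x%02x (%s)" % (value, value, name)


print(describe(24, SERVICE_STATUS_CODES))
print(describe(84, INSTANCE_TASK_CODES))
```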

Trove code modifications

1. Boot from image and create a new volume: vim ./taskmanager/models.py +913

Debug output from _create_server and _create_server_volume_individually:

    XXXXXXXXXXX: files: {'/etc/trove/conf.d/guest_info.conf': u'[DEFAULT]\nguest_id=260ad404-a89a-4b46-81a4-a8b6fc052e49\ndatastore_manager=mysql\ntenant_id=7e031147260646e58e9d4b138b6f9d95\n'} _create_server /usr/lib/python2.7/site-packages/trove/taskmanager/models.py:803

    XXXXXXXXXXX: nics: [{u'net-id': u'2b1dd0cc-531e-469c-b997-2c54e8175160', u'v4-fixed-ip': u''}] _create_server /usr/lib/python2.7/site-packages/trove/taskmanager/models.py:804

    flavor_id XXX 3 _create_server_volume_individually /usr/lib/python2.7/site-packages/trove/taskmanager/models.py:680
    image_id XXX c303601d-a5fe-451d-8354-13a6c55ac4a6 _create_server_volume_individually /usr/lib/python2.7/site-packages/trove/taskmanager/models.py:681
    block_device_mapping XXX {'vdb': u'c4f08ec5-e27b-4738-87a2-ad631c5fc65c::1:1'} _create_server_volume_individually /usr/lib/python2.7/site-packages/trove/taskmanager/models.py:682

inject_partition

    # File injection only if needed
    elif inject_files and CONF.libvirt.inject_partition != -2:
        if booted_from_volume:
            LOG.warn(_LW('File injection into a boot from volume '
                         'instance is not supported'), instance=instance)
        self._inject_data(
            instance, network_info, admin_pass, files, suffix)

Trove security group:

    def _create_secgroup(self, datastore_manager):
        security_group = SecurityGroup.create_for_instance(
            self.id, self.context)
        tcp_ports = CONF.get(datastore_manager).tcp_ports
        udp_ports = CONF.get(datastore_manager).udp_ports
        self._create_rules(security_group, tcp_ports, 'tcp')
        self._create_rules(security_group, udp_ports, 'udp')
        return [security_group["name"]]

Contents of the injected file:

    files = {guest_info_file: (
        "[DEFAULT]\n"
        "guest_id=%s\n"
        "datastore_manager=%s\n"
        "tenant_id=%s\n"
        % (self.id, datastore_manager, self.tenant_id))}

When cloud-init is used with Trove, the following file is used:

    [root@controller ~(admin)]# cat /etc/trove/cloudinit/mysql.cloudinit
    #!/bin/sh
    echo '[DEFAULT]
    guest_id=%(guest_id)s
    tenant_id=%(tenant_id)s
    datastore_manager=%(datastore_manager)s' > /etc/trove/conf.d/guest_info.conf
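
Both the file-injection path and the cloud-init path end up writing the same small INI file. As a sanity check, the sketch below renders it with the same `%` formatting as the snippet above and reads it back with the standard-library `configparser`. `render_guest_info` is a name invented for this example, and the ids are the sample values from the debug output on this page.

```python
import configparser

GUEST_INFO_TEMPLATE = (
    "[DEFAULT]\n"
    "guest_id=%s\n"
    "datastore_manager=%s\n"
    "tenant_id=%s\n"
)


def render_guest_info(guest_id, datastore_manager, tenant_id):
    """Build the text written to /etc/trove/conf.d/guest_info.conf."""
    return GUEST_INFO_TEMPLATE % (guest_id, datastore_manager, tenant_id)


content = render_guest_info(
    "260ad404-a89a-4b46-81a4-a8b6fc052e49",
    "mysql",
    "7e031147260646e58e9d4b138b6f9d95",
)

# Parse the rendered text back to confirm it is a well-formed INI file.
parser = configparser.ConfigParser()
parser.read_string(content)
defaults = parser.defaults()
print(dict(defaults))
```

If the guest agent cannot find its guest_id, checking that this file exists inside the instance and parses cleanly is a quick first step.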

These are the services that cloud-init starts in the image. cloud-final is the last cloud-init service to start, so the Trove service can be ordered to start after it:

    -rw-r--r--. 1 root root 400 Apr  1  2014 cloud-config.service
    -rw-r--r--. 1 root root 489 Apr  1  2014 cloud-config.target
    -rw-r--r--. 1 root root 405 Apr  1  2014 cloud-final.service
    -rw-r--r--. 1 root root 325 Apr  1  2014 cloud-init-local.service
    -rw-r--r--. 1 root root 416 Apr  1  2014 cloud-init.servic

The list call returns at most 21 records:

    SELECT instances.id AS instances_id,
           instances.created AS instances_created,
           instances.updated AS instances_updated,
           instances.name AS instances_name,
           instances.hostname AS instances_hostname,
           instances.compute_instance_id AS instances_compute_instance_id,
           instances.task_id AS instances_task_id,
           instances.task_description AS instances_task_description,
           instances.task_start_time AS instances_task_start_time,
           instances.volume_id AS instances_volume_id,
           instances.flavor_id AS instances_flavor_id,
           instances.volume_size AS instances_volume_size,
           instances.tenant_id AS instances_tenant_id,
           instances.server_status AS instances_server_status,
           instances.deleted AS instances_deleted,
           instances.deleted_at AS instances_deleted_at,
           instances.datastore_version_id AS instances_datastore_version_id,
           instances.configuration_id AS instances_configuration_id,
           instances.slave_of_id AS instances_slave_of_id,
           instances.cluster_id AS instances_cluster_id,
           instances.shard_id AS instances_shard_id,
           instances.type AS instances_type
    FROM instances
    WHERE instances.deleted = false
      AND instances.tenant_id = '736cb342d1724e29b973595a05008c68'
      AND instances.cluster_id IS NULL
    ORDER BY instances.id
    LIMIT 21

Trove API change: add a security_groups parameter, modelled on nics. The nics parameter is a list of dictionaries in the following format:

    nics = [{'net-id': 'xxxxx'}]

Instance creation in Trove failed because of block device mapping. To fix it, change the following parameters in the configuration file on the compute nodes:

    block_device_allocate_retries = 300
    block_device_allocate_retries_interval = 10
    block_device_creation_timeout = 60
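
As a rough guide to what the first two values mean together, the sketch below multiplies them out. It assumes the compute service polls the volume once per interval until the retry count is exhausted, which is an assumption about Nova's behaviour rather than something stated on this page. `max_volume_wait_seconds` is a name invented for the example.

```python
def max_volume_wait_seconds(retries, interval_seconds):
    """Upper bound on the wait, assuming one poll per interval."""
    return retries * interval_seconds


total = max_volume_wait_seconds(retries=300, interval_seconds=10)
print("%d seconds (about %d minutes)" % (total, total // 60))
```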

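There is a code-level bug when Trove sets up a slave. Scenario: an RDS instance has two networks, a management network and a tenant network.

* Management network: opens only ssh and icmp to Capital Online devops, for operations and maintenance
* Tenant network: customers reach the RDS instance on port 3306 through their own private network

In this scenario the RDS instance runs into a problem when HA is set up: the master address configured on the slave is the management network address rather than the tenant network address. Port 3306 is not open on the management network, so the database cannot be set up for HA.

The root cause is here: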

    class MysqlGTIDReplication(mysql_base.MysqlReplicationBase):
        """MySql Replication coordinated by GTIDs."""

        def connect_to_master(self, service, snapshot):
            logging_config = snapshot['log_position']
            LOG.debug("connect_to_master %s" % logging_config['replication_user'])
            change_master_cmd = (
                "CHANGE MASTER TO MASTER_HOST='%(host)s', "
                "MASTER_PORT=%(port)s, "
                "MASTER_USER='%(user)s', "
                "MASTER_PASSWORD='%(password)s', "
                "MASTER_AUTO_POSITION=1 " %
                {
                    'host': snapshot['master']['host'],
                    'port': snapshot['master']['port'],
                    'user': logging_config['replication_user']['name'],
                    'password': logging_config['replication_user']['password']
                })
            service.execute_on_client(change_master_cmd)
            service.start_slave()

This code builds the SQL statement that configures the slave, and it uses the master's IP address.

    class MysqlReplicationBase(base.Replication):
        """Base class for MySql Replication strategies."""

        def get_master_ref(self, service, snapshot_info):
            master_ref = {
                'host': netutils.get_my_ipv4(),
                'port': service.get_port()
            }
            return master_ref

This code obtains the master's IP address. It does so through netutils.get_my_ipv4() from the oslo_utils package, which looks like this:

    def get_my_ipv4():
        """Returns the actual ipv4 of the local machine.

        This code figures out what source address would be used if some traffic
        were to be sent out to some well known address on the Internet. In this
        case, IP from RFC5737 is used, but the specific address does not matter
        much. No traffic is actually sent.

        .. versionadded:: 1.1

        .. versionchanged:: 1.2.1
           Return '127.0.0.1' if there is no default interface.
        """
        try:
            csock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
            csock.connect(('192.0.2.0', 80))
            (addr, port) = csock.getsockname()
            csock.close()
            return addr
        except socket.error:
            return _get_my_ipv4_address()

The address '192.0.2.0' used here is not a private address; it belongs to the TEST-NET-1 block that RFC 5737 reserves for documentation. The method also has a hidden requirement: the address it returns is always one that can be routed at layer 3, namely the address on the interface that has the gateway. In the scenario above that is the management network address.

Solution: implement your own method for obtaining the local IP address and use it in place of netutils.get_my_ipv4().
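
One possible shape for such a replacement is sketched below. It keeps the UDP-connect trick from get_my_ipv4() but lets the caller choose the probe destination, so the kernel's routing table picks the source address on the network that actually reaches that destination. `get_ipv4_towards` is a hypothetical helper, not part of oslo_utils or Trove, and how the tenant-side destination is obtained (for example from a configuration option) is left open.

```python
import socket


def get_ipv4_towards(destination, port=9):
    """Return the local IPv4 address the kernel would use to reach destination.

    Connecting a UDP socket sends no packet; it only makes the kernel pick a
    route and a source address, which getsockname() then reports.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.connect((destination, port))
        return sock.getsockname()[0]
    finally:
        sock.close()


# Loopback is used here only so the example runs anywhere.
print(get_ipv4_towards("127.0.0.1"))
```

Called with an address on the tenant network, such as the tenant subnet's gateway, this should return the instance's tenant-side address, which is the one the slave needs as MASTER_HOST.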

Trove backup. Straight to the code:

    def execute_backup(self, context, backup_info,
                       runner=RUNNER, extra_opts=EXTRA_OPTS,
                       incremental_runner=INCREMENTAL_RUNNER):

        LOG.debug("Running backup %(id)s.", backup_info)
        storage = get_storage_strategy(
            CONF.storage_strategy,
            CONF.storage_namespace)(context)

        # Check if this is an incremental backup and grab the parent metadata
        parent_metadata = {}
        if backup_info.get('parent'):
            runner = incremental_runner
            LOG.debug("Using incremental backup runner: %s.", runner.name)
            parent = backup_info['parent']
            parent_metadata = storage.load_metadata(parent['location'],
                                                    parent['checksum'])
            # The parent could be another incremental backup so we need to
            # reset the location and checksum to this parents info
            parent_metadata.update({
                'parent_location': parent['location'],
                'parent_checksum': parent['checksum']
            })

        self.stream_backup_to_storage(backup_info, runner, storage,
                                      parent_metadata, extra_opts)

Related configuration options and their defaults (excerpt):

    cfg.StrOpt('backup_strategy', default='InnoBackupEx',
    cfg.StrOpt('backup_namespace', default='trove.guestagent.strategies.backup.mysql_impl',
    cfg.StrOpt('restore_namespace', default='trove.guestagent.strategies.restore.mysql_impl',
    cfg.StrOpt('storage_strategy', default='SwiftStorage',
    cfg.StrOpt('storage_namespace', default='trove.guestagent.strategies.storage.swift'
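
Each strategy is configured as a class name plus a module namespace. The sketch below shows the general idea of resolving such a pair into a class with the standard library. It assumes the namespace is an importable module that defines the named class, which is a guess at the mechanism rather than Trove's actual loader. `load_strategy` is a made-up name, and a standard-library class stands in for a real strategy so the example runs without Trove installed.

```python
import importlib


def load_strategy(name, namespace):
    """Import the module named by namespace and return its attribute name."""
    module = importlib.import_module(namespace)
    return getattr(module, name)


# Stand-in for something like ('SwiftStorage', '<storage namespace>').
strategy_cls = load_strategy("OrderedDict", "collections")
instance = strategy_cls()
print(strategy_cls.__name__)
```

In execute_backup above, the class returned by get_storage_strategy is then called with the request context to obtain the storage object.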

* Instantiate SwiftStorage
* Instantiate swiftclient, i.e.
*