106 Commits

Author SHA1 Message Date
wanyaoqi
4484a96c8d feat(region,host): vgpu support, refactor host register (#17879)
* feat(region,host): refactor host register

* feat(region,host): vgpu support

Add NVIDIA/AMD vGPU support.
NVIDIA vGPU instances must be configured manually before they are added to host.conf.

Signed-off-by: wanyaoqi <d3lx.yq@gmail.com>
2023-09-01 15:42:22 +08:00
wanyaoqi
de8d4e588c fix(host): host add option no_hpet
Signed-off-by: wanyaoqi <d3lx.yq@gmail.com>
2023-06-12 17:22:46 +08:00
wanyaoqi
9d12a39598 feat(region,host): custom pci device type 2023-05-17 17:23:34 +08:00
wanyaoqi
cd41911925 feat(region,host): lvm storage backend
Support using an LVM volume group as the storage backend.

host config example:
lvm_volume_groups:
- storage1
2023-04-24 17:55:44 +08:00
wanyaoqi
d394d9bb55 feat(region,host): nvme device passthrough 2023-04-24 17:55:39 +08:00
Jian Qiu
007ff435a1 fix: ovn options missing eip and brmapped (#16028)
Co-authored-by: Qiu Jian <qiujian@yunionyun.com>
2023-02-23 17:14:02 +08:00
Qiu Jian
9433fc14a8 fix: add disable_local_vpc option and auto config ovn for lbagent and host
2023-02-22 14:35:43 +08:00
wanyaoqi
693257697e feat(region,host): support ovs offload. (#15915)
Cloudpods OVS offload implementation, based on SR-IOV support.
IsolatedDevices adds the field `OvsOffloadInterface`.

Host config adds `ovs_offload_nics` to specify which NICs enable OVS
offload. The flag `disable_sriov_nics` is removed and the flag
`sriov_nics` is added to configure SR-IOV NICs.

Tested Hardware: Mellanox ConnectX-5 NICs
2023-02-18 08:48:07 +08:00
wanyaoqi
e34afc1abb feat(host): kvm vcpus bind cpuset on numa nodes and cpu dies (#15732)
- allocate cpuset on guest startup, trying to fit within one NUMA node or CPU die
- fix guest startup task not being cleaned up
- fix cgroup tasks not being cleaned up
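
The "try one NUMA node first" idea in the first bullet can be illustrated with a minimal sketch. This is not the hostagent's actual algorithm: the function name `pick_cpus`, the `free_by_node` mapping, and the best-fit and spill-over policies are all assumptions made for illustration.

```python
from typing import Dict, List


def pick_cpus(free_by_node: Dict[int, List[int]], vcpus: int) -> List[int]:
    """Pick host CPUs for a guest, preferring a single NUMA node.

    free_by_node is a hypothetical map of NUMA node id -> free host CPU ids.
    """
    # Nodes that can hold every vCPU on their own.
    fitting = [n for n, cpus in free_by_node.items() if len(cpus) >= vcpus]
    if fitting:
        # Assumed policy: best fit, i.e. the fitting node with the fewest free CPUs.
        node = min(fitting, key=lambda n: (len(free_by_node[n]), n))
        return sorted(free_by_node[node])[:vcpus]

    # Assumed fallback: no single node fits, so span nodes, largest first.
    picked: List[int] = []
    for node in sorted(free_by_node, key=lambda n: (-len(free_by_node[n]), n)):
        need = vcpus - len(picked)
        picked.extend(sorted(free_by_node[node])[:need])
        if len(picked) == vcpus:
            return picked
    raise ValueError("not enough free CPUs on this host")
```

With `{0: [0, 1, 2, 3], 1: [4, 5]}` as the free map, a 2-vCPU guest lands entirely on node 1, while a 5-vCPU guest has to span both nodes.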
2023-01-08 09:29:12 +08:00
wanyaoqi
f5415697b9 fix(host): convert migrate set downtime value to float64 (#15475)
* fix(host): convert migrate set downtime value to float64

* fix(host): live migration optimize

QEMU sends a `STOP` event when the final migration pass starts, but at
that moment the migration has not completed, so we need to wait for the
migration to complete.
Make the auto-converge max CPU throttle configurable.
2022-12-08 14:45:37 +08:00
Qiu Jian
e5adbfc19a fix: host disable gso for specific nics 2022-12-04 00:56:17 +08:00
Qiu Jian
1d0f715843 fix: automatically set disk is_ssd according to storage medium_type 2022-11-15 00:14:33 +08:00
wanyaoqi
70755a9a1b feat(region,host): sriov nic support (#15342)
Support SR-IOV NIC passthrough to KVM guests.

Signed-off-by: wanyaoqi <d3lx.yq@gmail.com>
2022-11-13 18:17:30 +08:00
wanyaoqi
3d0c09195e feat(region,host): hugepage optimize
- host agent no longer allocates hugepages on init
- memory not covered by hugepages is counted as reserved memory
- default hugepage size is 1024M
- host model adds a page size field

Signed-off-by: wanyaoqi <d3lx.yq@gmail.com>
2022-10-12 14:57:13 +08:00
Davidzkeng
bbaaf2da7e Feature/floppy (#15008)
* feat(region,host): add floppy, multiple cdrom

* fix floppy-device

Co-authored-by: huangzekeng <huangzekeng@grgbanking.com>
2022-09-27 21:15:52 +08:00
wanyaoqi
79dfa27ca5 fix(host): live migrate optimize (#14996)
- support set live migrate bandwidth
- support cancel migrate
- live migrate enable multifd
- cancel set live migrate max downtime in init live migrate
- save and sync guest desc and start script on sync config
- init machine pci addresses on load guest desc
- load memory devices for guests not init pci addresses
- attach network set upscript and downscript
- add a guest launcher for redirect qemu stdout/stderr
- start monitor on guest script start
- fix guest hotplug cpu and mem not update desc
2022-09-23 14:09:33 +08:00
wanyaoqi
48ce4b31cd feat(host): describe pci controller and devices in guest desc (#14826)
- describe pci controller and devices in guest desc.
- extend the pcie bus with a pcie-to-pci bridge and root ports; disks and
  nics attach to the pci-bridge by default to support hotplug.
- generate pci addresses on guest start
- live guest desc for running guest, and save source desc

Signed-off-by: wanyaoqi <wanyaoqi@yunion.cn>
2022-08-24 16:33:42 +08:00
Jian Qiu
4ccf115bb2 fix: allow start vm on raspberry pi 4 (#14762)
Co-authored-by: Qiu Jian <qiujian@yunionyun.com>
2022-08-16 19:03:56 +08:00
Jian Qiu
fc1a949c50 feat(region,host): mem clean after guest exited. (#14703)
Signed-off-by: wanyaoqi <d3lx.yq@gmail.com>

Co-authored-by: wanyaoqi <d3lx.yq@gmail.com>
2022-07-25 09:17:43 +08:00
Qiu Jian
dbda136ddc fix: remove duplicate enable_remote_executor option 2022-07-08 09:32:17 +08:00
wanyaoqi
02a9d84d6e fix(region, host): misc fix host health checker (#14575)
- hostagent removes the enable host health option; the health checker is
  enabled by default and relies on etcd endpoint registration.
- add option to auto migrate on host shutdown.
- fix hostman check of whether the network is available.
- use hostname instead of hostId as etcd key
- init health checker before host instance init.
- refactor host_health checker.

Signed-off-by: wanyaoqi <wanyaoqi@yunion.cn>
2022-07-07 10:28:27 +08:00
wanyaoqi
d51d7599d2 feat(region,host): refactor guests cgroups (#14558)
* feat(region,host): refactor guests cgroups

hostagent creates a root group cloudpods.hostagent at init, and guest
groups live under cloudpods.hostagent. The QEMU start params add the '-S'
option to freeze the guest at first; once the guest's cgroup and other
initialization are set up, hostagent resumes the guest.

Hosts support reserving CPUs. If there are reserved CPUs, hostagent creates
a group cloudpods.hostagent.reserved. Note that the host must have no
guests running when its reserved CPUs are set.

climc usage:
  climc host-reserve-cpus --cpus '2-3,32-33' --mems '0-1' \
    --disable-sched-load-balance <IDS>
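
The `--cpus` and `--mems` values above use the comma-and-range list notation familiar from Linux cpusets. A minimal sketch of expanding such a string into individual indices, assuming that notation; `parse_cpu_list` is a hypothetical helper, not part of climc or hostagent:

```python
from typing import List, Set


def parse_cpu_list(spec: str) -> List[int]:
    """Expand a list such as '2-3,32-33' into sorted CPU indices."""
    cpus: Set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate empty items such as a trailing comma
        if "-" in part:
            low, high = part.split("-", 1)
            start, end = int(low), int(high)
            if start > end:
                raise ValueError("invalid range: %r" % part)
            cpus.update(range(start, end + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)
```

For the example above, `parse_cpu_list('2-3,32-33')` yields CPUs 2, 3, 32 and 33.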

Signed-off-by: wanyaoqi <d3lx.yq@gmail.com>

* hostman: disable cpuset balancer

Signed-off-by: wanyaoqi <d3lx.yq@gmail.com>
2022-07-02 18:00:48 +08:00
Jian Qiu
dc3c08e8fe Hotfix/qj tap service mirrorfixes2 (#14476)
* fix: no details for net_tap_flows

* fix: tap support bugfixes

Co-authored-by: Qiu Jian <qiujian@yunionyun.com>
2022-06-16 02:56:18 +08:00
Jian Qiu
14e7e9603c feature: support tap service phase 2 (#14466)
Co-authored-by: Qiu Jian <qiujian@yunionyun.com>
2022-06-15 05:21:43 +08:00
Qiu Jian
87f84477f1 fix: move hostoptions of qemu_version and host_cpu_passthrough to base options
2022-06-11 19:23:21 +08:00
Jian Qiu
43d04043a5 feature: fetch options of an application via api (#14398)
Co-authored-by: Qiu Jian <qiujian@yunionyun.com>
2022-06-04 10:33:31 +08:00
Jian Qiu
618e88c51a fix: backup pack missing encrypt_key_id (#14333)
Co-authored-by: Qiu Jian <qiujian@yunionyun.com>
2022-05-29 10:12:47 +08:00
Jian Qiu
1c2a4ed9d4 fix: host ping piggyback storage info (#14257)
Co-authored-by: Qiu Jian <qiujian@yunionyun.com>
2022-05-16 00:34:50 +08:00
Jian Qiu
2dd1aeb8aa fix: mount disk readonly (#13152)
Co-authored-by: Qiu Jian <qiujian@yunionyun.com>
2022-04-28 09:26:45 +08:00
Jian Qiu
38bd88aa26 fix: always recycle disk files (#14066)
Co-authored-by: Qiu Jian <qiujian@yunionyun.com>
2022-04-20 16:29:30 +08:00
Qiu Jian
c5e0158ea6 fix: change qemu default version to 4.2.0 2022-04-12 12:06:35 +08:00
Qiu Jian
df5bb58318 fix: migrate set downtime 2022-04-05 15:16:46 +08:00
Zexi Li
c0a0121d2b feat(host): server instance snapshot with memory 2022-03-03 13:53:11 +08:00
Qiu Jian
97866bfee0 feature: cleanup native hugepage codes 2022-02-06 17:58:16 +08:00
Qiu Jian
02725606d9 fix: rng device should use urandom instead of random 2022-01-24 23:01:10 +08:00
rainzm
5c6ba0d750 feat(region, host): support disk backup
1. create backup for disk
2. create disk from backup
3. recovery disk directly
4. delete backup
2022-01-19 12:04:27 +08:00
Qiu Jian
41ca44f02e fix: several improvements on qemu migration:
1. enable qemu debug log if host log_level is debug
2. disable rng device by default
3. handle resume fail post migration
2021-12-30 02:02:35 +08:00
Zexi Li
afd5f0f220 fix(region,host): sync usb isolated device error not display 2021-12-29 17:38:56 +08:00
Zexi Li
8423aa9e63 feat(region,host): usb passthrough 2021-12-28 19:05:42 +08:00
Qiu Jian
9856331086 fix: 1. migration timeout applicable to shared storage migrate only
2. migration timeout logic error
3. cancel host download handler timeout
4. add enable_vm_uuid option
5. add migrate event log notes
2021-12-23 01:34:30 +08:00
Zexi Li
5fd55d9008 optimize(host-image,fetcherfs): download disk speed too slow (#12976) 2021-12-22 00:56:49 +08:00
Qiu Jian
c67f1bde5b fix: server migration halted by an unexpected STOP event 2021-12-18 22:00:06 +08:00
Qiu Jian
718dbd3cb8 fix: add host option for ethtool_enable_gso 2021-11-19 03:14:25 +08:00
Jian Qiu
ae3104e980 fix: add switch of sdnagent tc man (#12697)
Co-authored-by: Qiu Jian <qiujian@yunionyun.com>
2021-11-18 05:16:42 +08:00
Qiu Jian
a970249cc1 fix: delayed probing GPU on host init and add disable_gpu option 2021-10-30 11:46:16 +08:00
Qiu Jian
a8df3cef5a fix: host add disable_kvm option 2021-09-26 17:35:16 +08:00
Zexi Li
ab496d0e4f fix(region): allow guest network zero bandwidth limit 2021-08-23 10:23:14 +08:00
Zexi Li
2b404c89d0 fix(host,region): disable health checker by default 2021-07-29 19:07:30 +08:00
Zexi Li
e770f4f653 feat(host): aware of kubelet eviction config 2021-07-07 21:27:39 +08:00
Yousong Zhou
9ddb01a95d feat(hostman): options: add sdn_allow_conntrack_invalid 2021-01-28 15:21:33 +08:00