Differences between revisions 39 and 50 (spanning 11 versions)
Revision 39 as of 2019-04-13 12:09:31
Size: 18383
Editor: NickBannon
Comment: Example with network
Revision 50 as of 2022-08-16 01:24:11
Size: 19502
Editor: JamesArcus
Comment:

Proxmox VE is used by UCC as a virtual machine management infrastructure, and is a lot like VMware. It is based on KVM and OpenVZ, and is built on Debian, which UCC likes because it's FREE.

Info for Users

Getting a VM

This will require wrangling a wheel member to help you create a machine through the Proxmox interface. Beware the Wheel member may say no to giving you a VM for any reason. Setting up a VM does take some time, so don't expect them to drop what they're doing and create it for you on the spot. The best way to do this is to jump on IRC or email wheel@ucc. Assuming the wheel member gives you a full VM and not just a container, what you should end up with is essentially an empty computer - you will need to install an OS on it and SECURE IT yourself.

Setting Up Your VM

Logging Into The Interface

First, you must be on the UCC network - the web interface is fully firewalled off from the outside internet for security reasons. To get on the UCC network, use a clubroom machine, connect to the UCC wireless, or from anywhere else, connect to the UCC VPN.

For those who know ssh or don't want to connect using the other methods, you can also ssh-forward to medico: ssh -fN -L 8006:medico:8006 motsugo.ucc.asn.au for example.
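As a rough sketch, the forward can be wrapped in a small helper that only prints the command to run. The helper name ucc_tunnel_cmd is made up for illustration; its defaults are copied from the example above.

```shell
# Print the ssh command that forwards a local port to the Proxmox web UI.
# Arguments (all optional): JUMP_HOST TARGET PORT.
# The helper name is hypothetical; defaults mirror the example above.
ucc_tunnel_cmd() {
    printf 'ssh -fN -L %s:%s:%s %s\n' \
        "${3:-8006}" "${2:-medico}" "${3:-8006}" "${1:-motsugo.ucc.asn.au}"
}

ucc_tunnel_cmd
```

With the forward running, the interface should be reachable on port 8006 of your own machine (https://localhost:8006); expect the browser to complain that the certificate name does not match.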

Once you are on the UCC network, browse to https://medico.ucc.gu.uwa.edu.au:8006 from any modern browser and log in using your UCC username and password.

Installing an OS

To install the OS on your machine, you can either boot an installer from an ISO, or use UCC's netboot setup to install your OS. By default, all machines on the VM network without an OS will netboot, however installing from an ISO tends to be more reliable.

All gumby users (that's you if you're not on wheel) have upload privileges to a store of installer ISOs via the Proxmox web interface. You can mount an ISO from the ISOs storage location via your VM's Hardware tab by selecting the CD drive and clicking edit. If you have problems uploading to the ISOs storage area through the web interface, contact your friendly neighbourhood Wheel member and they can put it directly into /services/iso.

Once You Have Installed Your OS

There are a couple of things you need to tell the person who set up your machine - its hostname, its MAC address and its IP address. Then they will be able to set up DNS and a static DHCP entry. You will also need to let them know which firewall ports you need unblocked and which services you are running if you wish to run any externally accessible services on your machine - we generally don't let non-wheel members do their own firewalling. Gumby users must also nominate a wheel member who can have root ssh or console access on your machine for auditing purposes - for ssh their key then needs to be copied into /root/.ssh/authorized_keys or for console access they must have a fully privileged local account.
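One way to collect those details from inside the VM is sketched below. It assumes a Linux guest with the iproute2 ip tool installed; the helper name vm_report is made up for illustration.

```shell
# Collect what the wheel member needs: hostname, MAC and IP addresses.
# Run inside the VM. Assumes a Linux guest; `ip` comes from iproute2.
vm_report() {
    echo "hostname: $(uname -n)"
    if command -v ip >/dev/null 2>&1; then
        echo "interfaces (name, state, MAC):"
        ip -brief link show || echo "(could not list interfaces)"
        echo "addresses:"
        ip -brief address show || echo "(could not list addresses)"
    else
        echo "ip command not found; read the MAC and IP from your OS's network settings"
    fi
}

vm_report
```

Paste the output into your email to wheel, along with the ports and services you need opened.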

Securing Your Machine

TODO - ask a Wheel member what to do for now.

Info for Administrators

Authentication

Out of the box, the web interface uses the username root and the root password of the host. The LDAP implementation in Proxmox isn't "true" LDAP in that Proxmox only looks at LDAP for authentication and cannot consult LDAP for a list of users or group permissions. Other users can be added by creating their username in the web interface and setting the authentication realm to UCC's LDAP. The username must correspond to a UCC LDAP username.

To add yourself to the Administrator group, SSH to medico, and run something like:

pveum useradd accmurphy@UCCDOMAYNE -group Administrator -firstname ACC -lastname Murphy -email [email protected]

Alternatively, get another administrator to create your user through the web interface. If you fail to log in twice in a row, stop and contact wheel, or you will be locked out.

Storage

Virtual machines should be stored as .raw images in the appropriate vmstore area. Storage is managed at the cluster level, so every storage device is available to (or created on) every node, unless it is explicitly limited to a particular node or group of nodes. This is necessary in order to migrate machines to different nodes without having to backup and restore the VM's disk to a different path.

On the Atlantic cluster, there are several vmstores per node:

  • /var/lib/vz is local on every node; it is part of the default install and cannot be removed. This permits storing VMs and containers locally on a node, but does not allow live migrations. Beware that each node may have its local disks set up with different levels of reliability and amounts of space. Avoid using local storage if at all possible.
  • /vmstore-nas is a RAID1/0 volume on Molmol mounted over NFS. It can be used for VMs and containers as it has plenty of space.
  • loveday additionally has its own single spinning disk as storage for VMs while the machine is at camp. The only way to get VMs on to it is to stop them and migrate them to this storage. It is not in RAID - do NOT use it for day-to-day use.

Adding VMs

These instructions are for creating a VM for a general UCC member - do not blindly follow them for UCC servers.

To create the actual machine:

  1. Log in as an administrator
  2. Click on "Create VM" in the top right corner
  3. Set Name as the hostname of the VM, set the resource pool to Member-VMs and click next
  4. Select the desired OS for the VM and click next
  5. Select "Do not use any media" (unless media has already been decided and uploaded to the correct location) and click next
  6. Set the following options for the hard disk:
    • Bus device: VIRTIO 0
    • Storage: nas-vmstore
    • Disk size: 50
    • Format: raw
    • Cache: default (no cache)
    • Then click next
  7. Set the number of cores to 2 (leave everything else at default) and click next
  8. Set memory to 2GB and click next
  9. Set the network model to VirtIO (paravirtualised) and set the VLAN tag to 4 (the member VM VLAN). NB: for VLAN 2, set "No VLAN" or things will break
  10. Click next and then finish

The UI is a bit buggy. [SJY] wasted hours trying to work out why the Storage selector (in the Hard Disk tab) was grayed out, preventing the creation of new VMs. If you experience this issue (and storage is online, i.e. other VMs are working correctly) then your best bet is probably just to create the VM on the command line with something like qm create NEW_VM_ID --name NEW_VM_NAME --virtio0 nas-vmstore:0,format=raw --bootdisk virtio0 --ostype l26 --memory 2048 --onboot yes --sockets 1 --cores 2 (this command will not configure a network interface) and then tweak the config in the web UI.

Another example, for VLAN 4: qm create NEW_VM_ID --memory 512 --net0 virtio,bridge=vmbr0,tag=4 --ostype l26 --description "user1 sysadmin workshop" --virtio0 vmstore-ssd_vm:10 --sockets 1 --cores 1 --pool Member-VMs --name user1-NEW_VM_NAME 

The NEW_VM_ID can be allocated with: pvesh get /cluster/nextid

The QEMU guest agent should be enabled before the first boot, either in the Options dialogue or with: qm set NEW_VM_ID --agent 1

Later, within the running machine: apt install qemu-guest-agent
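The command-line steps above can be strung together roughly as follows. This is a sketch, not a tested UCC script: the helper names run and create_member_vm are made up, the sizes and the nas-vmstore storage ID are taken from the GUI steps and may need changing, and by default it only prints the commands instead of running them.

```shell
# Sketch: create a member VM on VLAN 4 from the command line.
# DRY_RUN=1 (the default here) prints the commands; DRY_RUN=0 runs them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# create_member_vm VMID NAME   (hypothetical helper)
# Values follow the GUI steps: 2 cores, 2GB RAM, 50GB raw disk, VLAN tag 4.
create_member_vm() {
    run qm create "$1" --name "$2" --pool Member-VMs \
        --memory 2048 --sockets 1 --cores 2 --ostype l26 \
        --net0 virtio,bridge=vmbr0,tag=4 \
        --virtio0 nas-vmstore:50,format=raw --bootdisk virtio0 \
        --onboot yes &&
    run qm set "$1" --agent 1
}

# On a real node the ID would come from: pvesh get /cluster/nextid
create_member_vm 999 user1-example
```

Check the printed commands first, then rerun on a cluster node with DRY_RUN=0 and a real VM ID.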

Under the summary tab of the newly created VM, go to the notes section and add a comment with the following information:

  • the member TLA

    • and/or (name and username of the owner, plus any additional contact email addresses)

  • date of creation
  • date that the VM can be deleted if it's just for testing
  • the VM's purpose
  • the hostname of the VM (if it's different to what it was named in proxmox)
  • the IP address of the VM (if the qemu-guest-agent is not installed and enabled)

  • any other pertinent information that may be helpful to management, such as extra SSH authorized_keys access

To edit the notes, triple-click the notes section.
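If you prefer to set the notes from the command line, they can be composed with a small helper and passed to qm via --description, as in the qm create example earlier. The helper name vm_notes and the field labels are assumptions; adjust them to taste.

```shell
# Compose the notes text for a VM, following the checklist above.
# vm_notes TLA PURPOSE [DELETE_AFTER] [HOSTNAME]   (hypothetical helper)
vm_notes() {
    echo "Owner TLA: $1"
    echo "Created: $(date +%Y-%m-%d)"
    echo "Purpose: $2"
    [ -n "${3:-}" ] && echo "Delete after: $3"
    [ -n "${4:-}" ] && echo "Hostname: $4"
    return 0
}

vm_notes ABC "sysadmin workshop" 2023-01-31
```

On a cluster node, something like qm set NEW_VM_ID --description "$(vm_notes ABC 'sysadmin workshop')" should then store the text in the notes field.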

Under the options tab:

  • Change "Start at boot" to yes

Under the permissions tab:

  • Add a user permission for the owner of the VM and give them the role "UCC_VM_User". This allows the user to do anything you could do to a physical machine without taking the cover off. If the user doesn't exist yet, see the Authentication section on how to create it.

The VM should now boot, however it is essentially a blank machine and will netboot.

The VM will get an IP address from DHCP, however this should be set to a static entry in Murasoi as soon as the MAC address of the VM is known in order to avoid conflicts. Also add a DNS entry on Mooneye as you would for a physical UCC machine.

Adding Containers

The information below has not been updated for Proxmox 4, which notably uses LXC containers instead of OpenVZ. One major drawback of LXC containers is that they cannot currently be live migrated. Use containers at your own risk - you WILL have more outages and you WON'T be warned before they are turned off.

An OpenVZ Container, or CT, is a paravirtualised environment. It is more like a chroot on steroids than a full virtual machine, and it uses the host kernel but a separate userland environment. Container technology allows you to set a quota on disk, memory and CPU usage, but unused resources can be shared. If you just need a clean environment to run a few daemons or test something out, a container makes better use of our resources.

  1. Log in as an administrator.
  2. Click on "Create CT" in the top right corner
  3. Set the following general options:
    • Name: the hostname of the container
    • Resource pool: Member-VMs (or as appropriate)
    • Storage: nas-vmstore (or as appropriate - see Storage above)
    • Password/confirm password: the root password for your new container
  4. Click Next.
  5. Select a template (base image for the operating system). debian-7.0-standard or similar is probably the way to go.

  6. Click Next.
  7. Set appropriate resource limits. Remember that these are maximums, not guaranteed minimums, so you can set them quite high.
    • Memory: 2048 MB
    • Swap: 512 MB
    • Disk size: 50 GB
    • CPUs: 2
  8. Click Next.
  9. Network: unfortunately the UI for setting network options in containers is not as full-featured as VMs; in particular, there is no way to set a VLAN tag through the web UI.

    • If a machine room IP is appropriate (probably not), you can add that straight in as a 'Routed mode' IP, or do static configuration with 'Bridged mode' to vmbr0.

    • To use the more appropriate clubroom or VM networks, we will have to come back later. Choose 'Bridged mode', and continue once the container is created to edit the configuration on the command line.
  10. Click Next.
  11. Leave the DNS settings alone; click Next.
  12. Click Finish once you are happy with the settings. The container will be created, the template unpacked and the appropriate settings applied.

Under the summary tab of the newly created CT (container), go to the notes section and add a comment with the following information:

  • the member TLA

    • and/or (name and username of the owner, plus any additional contact email addresses)

  • date of creation
  • date that the CT can be deleted if it's just for testing
  • the CT's purpose and reason for choosing a CT over a VM
  • the hostname of the CT (if it's different to what it was named in proxmox)
  • the IP address of the CT
  • any other pertinent information that may be helpful to management, such as the intended users and administrators

To edit the notes, triple-click the notes section.

Under the options tab, set 'Start at boot'.

To manually manage the network configuration, keep following these directions:

  1. Take note of the container ID - the number next to the hostname in the description at the top of the screen or in the left-hand server list. The number 999 is used below; replace this with the appropriate ID.
  2. Log on to medico as root via SSH.
  3. Run the following command (with the correct container ID) to wipe out the existing network configuration:

vzctl set 999 --netif_del all --ipdel all --save
  4. Choose the correct bridge interface. For VLAN 3 (clubroom), use vmbr0v3 and for the VM network use vmbr0v4.

  5. Run the following command to add a new bridge, with the appropriate bridge device and container ID:

vzctl set 999 --netif_add eth0,,,,vmbr0v4 --save
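
The two vzctl steps can be sketched as a helper that picks the bridge for a VLAN and prints both commands for review. The helper names are made up, and like the rest of this section it applies only to the OpenVZ-era setup, not to LXC containers.

```shell
# Map a VLAN number to the bridge named in the steps above.
bridge_for_vlan() {
    case "$1" in
        3) echo vmbr0v3 ;;   # clubroom
        4) echo vmbr0v4 ;;   # VM network
        *) echo "no bridge listed for VLAN $1" >&2; return 1 ;;
    esac
}

# ct_network_cmds CTID VLAN   (hypothetical helper)
# Prints the commands that wipe and then re-add the container's network.
ct_network_cmds() {
    bridge="$(bridge_for_vlan "$2")" || return 1
    echo "vzctl set $1 --netif_del all --ipdel all --save"
    echo "vzctl set $1 --netif_add eth0,,,,$bridge --save"
}

ct_network_cmds 999 4
```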

You can now start the container and log in using the console.

You will probably need to set up the interfaces as you normally would; add something like this to /etc/network/interfaces on Debian:

auto eth0
iface eth0 inet dhcp

Permissions

If a user has more than one VM, it is worth creating a dedicated resource pool for that user. A resource pool is just a way of grouping several VMs together and allows permissions to be applied to the pool, which then propagates to all VMs in that pool. Create a pool by going to Datacenter->Pools->Create. After the pool appears in the menu tree, click on the pool and add any existing VMs for the user to that pool. Don't forget to then add PVEVMUser permissions to the pool.
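The same thing can probably be done from a shell on a cluster node. The sketch below only prints the commands; it assumes the /pools API endpoints and the pveum aclmod subcommand behave as documented for the Proxmox version in use (newer releases spell it pveum acl modify), and the pool name and VM IDs are placeholders.

```shell
# pool_cmds POOL USER@REALM VMID...   (hypothetical helper)
# Prints commands to create a pool, add VMs to it and grant PVEVMUser.
pool_cmds() {
    pool="$1"; user="$2"; shift 2
    vms="$(echo "$*" | tr ' ' ',')"
    echo "pvesh create /pools --poolid $pool"
    echo "pvesh set /pools/$pool --vms $vms"
    echo "pveum aclmod /pool/$pool --users $user --roles PVEVMUser"
}

pool_cmds accmurphy-vms accmurphy@UCCDOMAYNE 101 102
```

Read the printed commands against the pvesh and pveum help output on the node before running any of them.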

Resizing VMs

Resizing disks can be done through the web interface by going to the VM's Hardware tab, selecting the hard disk, and then selecting resize. Note this only allows the growing of disks - there is no way to shrink a volume once it has been grown aside from copying the data to a new image. Only some OSes will recognise a size change online, so the VM may need to be rebooted. Also note that resizing the disk will not resize the partitions or file systems; that is a separate step and out of scope of this page. See http://pve.proxmox.com/wiki/Resizing_disks and https://pve.proxmox.com/wiki/Resize_disks for more info.
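
From the command line the equivalent is qm resize. A minimal sketch that checks the size argument and prints the command follows; the helper name resize_cmd and the VM ID are placeholders.

```shell
# resize_cmd VMID DISK SIZE   (hypothetical helper)
# SIZE is either an increment (+10G) or a new absolute size (60G).
# Disks can only grow, so an absolute size must exceed the current one.
resize_cmd() {
    case "$3" in
        +[0-9]*|[0-9]*) ;;
        *) echo "size must look like +10G or 60G" >&2; return 1 ;;
    esac
    echo "qm resize $1 $2 $3"
}

resize_cmd 101 virtio0 +10G
```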

Command Line Management

Virtual machines are managed using the qm tool. Containers are managed using the pvectl tool (though you can use vzctl as well).

There is more information on command-line tools on the Proxmox wiki.

Troubleshooting

Corosync not working

Seemingly, if the network restarts abruptly, corosync can get really confused and start flooding the network (and, importantly, stop working altogether). The solution to this condition is to stop the corosync service on all hosts, then bring each one up in sequence.
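
A sketch of that procedure, printing the commands in the order they would be run. The helper name is made up and the host list is only an example; substitute the actual cluster nodes. The pvecm status check between starts is a suggestion, not part of the original fix.

```shell
# corosync_restart_cmds HOST...   (hypothetical helper)
# Stop corosync everywhere first, then start it one node at a time.
corosync_restart_cmds() {
    for h in "$@"; do
        echo "ssh root@$h systemctl stop corosync"
    done
    for h in "$@"; do
        echo "ssh root@$h systemctl start corosync"
        # Check membership before moving on to the next node.
        echo "ssh root@$h pvecm status"
    done
}

corosync_restart_cmds medico maltair loveday
```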

Info for Installers

Installation

Proxmox can be installed using either a baremetal installer ISO or an existing Debian installation (check kernel versions as Proxmox replaces the existing kernel). The problem with the baremetal installer is that it does not allow you to set up your own logical volumes and doesn't give you the option of software RAID. IT WILL ALSO EAT ANY OTHER DISKS ATTACHED TO THE MACHINE, so disconnect any disks you don't want lost if using the baremetal installer. For these reasons, machines such as Medico and Maltair had Proxmox installed on top of pre-installed Debian. Ensure the version of Debian you're installing is compatible with the version of Proxmox you want.

Installation is straightforward if you follow the instructions on the Proxmox VE Installation page. Ensure that the Debian install follows the UCC almost-standard layout, with separate root, usr, var, boot, and home logical volumes. Put /var/lib/vz in its own logical volume, as this is where local VMs are stored by default.

Things missed by the manual installer

  • The notable instruction that is missing in the wiki page is to enable Kernel Samepage Merging (KSM) on the host, which is a memory de-duplication feature - google how to enable it and enable it with a line in /etc/rc.local (check Motsugo's for an example)

  • The proxmox installer fails to change the network configuration file to be suitable for virtual machines; check out the default configuration in Proxmox Network Model and modify /etc/network/interfaces to suit.

  • All the other items on the SOE page, with the exclusion of LDAP, NFS, dispense and most of the other user programs.

  • IPv6 configuration. Look at Motsugo's or Medico's config for an example.

  • Set up fail2ban on the web and ssh interfaces
  • Add entries for all other nodes and the storage server to the hosts file. We don't want the cluster dependent on DNS.
  • Uninstall rdnssd if you want DNS to work. It clobbers the resolv.conf file with an IPv6 address and is generally a pain in the butt. Then set DNS servers and search domains in /etc/network/interfaces
  • If using mirrored raid, install grub on both disks so things will still boot in the event of a disk failure.
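
For the KSM item, the kernel exposes the switch at /sys/kernel/mm/ksm/run. The sketch below only reports the current state; the helper name ksm_state is made up, and the rc.local line in the comment is the generic kernel interface, not necessarily what Motsugo uses.

```shell
# Report whether Kernel Samepage Merging is switched on.
# The optional path argument exists only to make the helper testable.
ksm_state() {
    ksm_file="${1:-/sys/kernel/mm/ksm/run}"
    if [ ! -r "$ksm_file" ]; then
        echo "unavailable"
        return 0
    fi
    case "$(cat "$ksm_file")" in
        1) echo "enabled" ;;
        0) echo "disabled" ;;
        *) echo "other" ;;   # e.g. 2 = stop and unmerge all pages
    esac
}

echo "KSM is $(ksm_state)"
# To enable it (as root), e.g. from /etc/rc.local:
#   echo 1 > /sys/kernel/mm/ksm/run
```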

Security

THIS IS CRITICAL FOR THE NODE TO FUNCTION IN THE CLUSTER, DO NOT IGNORE THIS STEP. Do not push wheel keys to the node - Corosync (the tool that replicates the cluster configuration across all nodes) will look after syncing wheel keys to the node once it's added to the cluster, and the push script is configured to only push to a single node in order to maintain wheel keys. Part of adding the machine to the cluster creates keys that allow every node root access to every other node, and these are appended to the authorized keys file, so you need to add the root key for the new node to the extra-maltair file before push.sh is run again, else the root key will get overwritten.

Security is paramount on a VM host because of the high potential for damage if the machine is compromised. Central fail2ban is set up to monitor the webpage and the ssh interface (see https://pve.proxmox.com/wiki/Fail2ban and http://blog.extremeshok.com/archives/810), however it is imperative that central logging is configured and TESTED for this to work. The web interface must not be unfirewalled to outside the UCC network under any circumstances.

NFS mounts are initially forbidden for containers by AppArmor (see https://unix.stackexchange.com/questions/396678/access-denied-when-trying-to-mount-nfs-share and https://forum.proxmox.com/threads/nfs-file-system-mount-problem-apparmor.31706/). The denial shows up in dmesg and can be lifted by adding mount rules to the LXC profile:

$ dmesg -T
...
[Mon Feb 19 13:46:44 2018] audit: type=1400 audit(1519047865.509:122): apparmor="DENIED" operation="mount" info="failed type match" error=-13 profile="lxc-container-default-cgns" name="/away/" pid=19745 comm="mount.nfs" 
# vi /etc/apparmor.d/lxc/lxc-default-cgns
...
  mount fstype=nfs*,
  mount options=(rw, bind, ro),
# systemctl reload apparmor

Post-install Configuration

With a single VM host, you would have to configure the storage locations and authentication methods - this is now controlled by the cluster and will be automatically taken care of when you add the node to the cluster. So...not a lot to do except:

  • add the node to the cluster

  • Check whether the ~/.ssh/id...pub key from the new host needs to be added to the shared authorized_keys (this appears to be needed with v6.4)

See Also

The SOE says how to do some of the things this page tells you to do.