The Darker side of Linux: 2010

Saturday, October 2, 2010

VDE and QEMU

Networking many qemu-kvm guests on one hosting machine, or virtual machines with virtual networking

VDE - Virtual Distributed Ethernet. Creates a virtual switch on your machine in software.
QEMU - a generic, open source machine emulator and virtualizer.
QEMU-KVM - QEMU with the Linux Kernel-based Virtual Machine, which uses the hardware virtualization support in the processor (very fast).

I have been using qemu-kvm for the last 2 years, and the biggest pain is setting up networking for a group of virtual machines. You can set up a tun/tap and bridge interface for each virtual machine, but it is a pain, and when you list all of the network interfaces it is a mess to take in. VDE looks like a way to simplify this mess: you set up one virtual switch that has a single tap connection to the host and many qemu-kvm connections to the switch. It is not perfect, and it may still require some bridge creation and/or firewall rules.

Setting up VDE

There is a VDE daemon that must be started to bring up a switch:
vde_switch -tap tap0 -daemon
This daemon must be running before qemu can use VDE. You can bring it up when the machine boots, or from the script that starts your qemu-kvm machines. Note that it creates a control socket in /tmp/ or /var/tmp, depending on how VDE was installed and on any parameter overrides.

Creating a Linux bridge interface

To let the VMs communicate with the outside, you can either set up routing in Linux with a firewall SNAT rule, or set up a bridge interface over your active network interface. The bridge option is a bit easier, and better if you want to connect to your virtual machines from outside the host machine. You can create a br0 device through your network startup (using your normal distro startup) or by hand (using brctl from bridge-utils), then attach your active network interface and the tap0 device to it, and you are done with the network setup, well almost. Note: distro networking is notoriously confusing and worthless. Yes, people do get it to work (in most cases people running old distros that a Google search works well for), but for the rest of us it's not worth the pain, since setting it up requires understanding start scripts written by some bash hacker of dubious intentions. The doc for brctl is easy to understand, so why bother with anything else?
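
A minimal sketch of the by-hand route, assuming the active interface is called eth0 (substitute your own) and that bridge-utils and iproute2 are installed. The run helper and the APPLY variable are names made up for this sketch, not part of any tool. By default the script only prints the commands so they can be reviewed; set APPLY=1 and run it as root to actually apply them.

```shell
#!/bin/sh
# Print each command instead of running it unless APPLY=1 is set.
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

# setup_bridge BRIDGE NIC TAP: bridge the physical NIC with the VDE tap.
setup_bridge() {
    bridge=$1 nic=$2 tap=$3
    run brctl addbr "$bridge"          # create the bridge device
    run brctl addif "$bridge" "$nic"   # attach the active network interface
    run brctl addif "$bridge" "$tap"   # attach the tap owned by vde_switch
    run ip link set "$tap" up
    run ip link set "$bridge" up
}

# eth0 is an assumption; br0 and tap0 are the names used in this post.
setup_bridge br0 eth0 tap0
```

Once eth0 is part of the bridge, the host's IP address has to live on br0 rather than on eth0, for example by pointing your DHCP client at br0.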

Using vde with qemu-kvm

To put all of this together you need to run qemu-kvm under vde, and the best way is a command of this form:

vdeq qemu-kvm -net vde,vlan=0 -net nic,model=virtio,vlan=0,macaddr=52:54:00:00:AA:01 -curses -drive file=/space/images/linux-amd64-2.img,snapshot=off,cache=none,if=virtio,boot=on -smp 2 -m 1G -monitor telnet:127.0.0.1:9221,server,nowait

When running qemu-kvm under vdeq you can add VDE network options, i.e. "-net vde". With this option qemu connects to a free port on the vde_switch, which is now bridged with the br0 device and has free access to the internet/LAN. The one trick here is defining the MAC address with "macaddr=". This needs to be done for each VM you connect to the switch, otherwise all machines get the same default MAC address. The rest of the command is pure qemu, and I leave that to the reader to figure out.
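
One way to keep the MAC addresses unique is to derive them from a VM number. This is only a sketch: vm_mac is a made-up helper name, and the 52:54:00:00:AA prefix is simply the one used in the command above.

```shell
#!/bin/sh
# vm_mac N: print a MAC address for VM number N (0..255).
# The last byte of the address is N in hex; the prefix is fixed.
vm_mac() {
    n=$1
    if [ "$n" -lt 0 ] || [ "$n" -gt 255 ]; then
        echo "vm_mac: VM number must be 0..255" >&2
        return 1
    fi
    printf '52:54:00:00:AA:%02X\n' "$n"
}

vm_mac 1     # prints 52:54:00:00:AA:01
vm_mac 12    # prints 52:54:00:00:AA:0C
```

In a start script this could be used as -net nic,model=virtio,vlan=0,macaddr=$(vm_mac 1).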

Running multiple network interfaces

One last trick is running qemu-kvm with vde and having several interfaces, eth[1-100]. The standard way of doing this is to bring up multiple vde_switch daemons, connect to the different UNIX sockets of those daemons, set up multiple bridges for the switches, and so on. This gets very complex very fast, and in most cases it is not needed. The quick way is to use the vlan=X option with -net vde and use the same vde_switch for each vlan, for example:

vdeq qemu-kvm -curses -drive file=/space/images/linux-amd64-1.img,cache=none,if=virtio,snapshot=off,boot=on -smp 2 -m 1G -monitor telnet:127.0.0.1:9222,server,nowait -net vde,vlan=0 -net nic,model=virtio,vlan=0,macaddr=52:54:00:00:AA:03 -net vde,vlan=1 -net nic,model=virtio,vlan=1,macaddr=52:54:00:00:AA:04

Note the vlan=0 and vlan=1. Each one creates a separate interface in the qemu-kvm guest, and both ride on the same vde_switch. For this to work your networks must not conflict (addresses/subnets/netmasks), but if that is clean, as it should be, you will be running 2, 3 or 4 different networks on the same switch. (There may be security issues with this, and so you may be forced not to do it.) The vlan= option gives you an easy way to simulate multiple networks between multiple VMs.
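
Typing the -net pairs by hand gets old quickly, so here is a small sketch that generates them. vde_net_opts is a made-up helper, and the MAC layout (VM id in the fifth byte, vlan index in the sixth) is an assumption chosen for this example; any scheme works as long as every MAC is unique.

```shell
#!/bin/sh
# vde_net_opts VM_ID COUNT: print the "-net" options for COUNT interfaces,
# one qemu vlan per interface, all attached to the same vde_switch.
vde_net_opts() {
    vm=$1 count=$2 i=0 opts=
    while [ "$i" -lt "$count" ]; do
        mac=$(printf '52:54:00:00:%02X:%02X' "$vm" "$i")
        opts="$opts -net vde,vlan=$i -net nic,model=virtio,vlan=$i,macaddr=$mac"
        i=$((i + 1))
    done
    echo "${opts# }"    # strip the leading space
}

# Two interfaces for VM number 3.
vde_net_opts 3 2
```

The output can be dropped into the command line unquoted, e.g. vdeq qemu-kvm ... $(vde_net_opts 3 2).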

Summary

Using vde with qemu-kvm simplifies putting many qemu-kvm machines on the same host and makes networking very straightforward for both the VMs and the hosting machine. For me it's a breath of fresh air for networking: simple and direct. Anyone with any other networking gems for Linux? Any other experiences with vde/qemu/kvm?

Friday, August 20, 2010

Moving to corosync and pacemaker

After using Heartbeat 1.X for the last 4 years, it looked like it was time to move to the next generation. Heartbeat itself is no longer the package it was, and looking at the projects out there, Corosync with Pacemaker seems like the future.

Using Heartbeat

Most of the machines using Heartbeat were just 2-machine clusters sharing IP(s) between them, so this should have been a simple move. Yeah, right. The move to XML, configuration-option hell and poor examples made this a bit harder than you might think, not that Heartbeat was ever easy. When Heartbeat burned you it burned hard, because these were machines that were never supposed to go down, and that is always the problem with upgrading machines that are never supposed to go down.

Finding the documentation

The Heartbeat web site is mostly not maintained any more, and for the 2.X versions the doc, well, sucked. The Corosync site, corosync.org, looks very nice and gives you enough to get Corosync up, but not much else. The Pacemaker site, http://clusterlabs.org/, is good, but there are so many options and the doc goes on and on, so it is useful but NOT as an intro doc. So it's off to Google, looking through this and that to put together something that will get this going.

Working with it and making it work

Heartbeat 1.X was only for 2-machine cluster configurations, but Corosync/Pacemaker is built for much more, which makes it more interesting and more challenging to work with. Learning the configuration options of Corosync and Pacemaker meant understanding the flexibility and expandability of the two products, which made it a slow process. To start, you first have to get Corosync up. Corosync has a standard configuration file, /etc/corosync/corosync.conf, in the stanza format that seems popular today. Note the totem stanza; it is important.
corosync.conf:

compatibility: whitetank

totem {
        version: 2
        secauth: on
        threads: 0
        interface {
                ringnumber: 0
                bindnetaddr: 192.168.254.2
                mcastaddr: 226.94.1.1
                mcastport: 5405
        }
}

logging {
        fileline: off
        to_stderr: no
        to_logfile: yes
        to_syslog: yes
        logfile: /var/log/cluster/corosync.log
        debug: off
        timestamp: on
        logger_subsys {
                subsys: AMF
                debug: off
        }
}

amf {
        mode: disabled
}
aisexec {
        user:  root
        group: root
}
service {
        name: pacemaker
        ver: 0
}

The thing to understand here is that Corosync is the networking part of the puzzle, so its configuration cares about IP addresses, secure communications and who is in the cluster, at the communications layer only. It does launch Pacemaker, but that is it for the application side. There is still a lot of info to take in here and lots of ways to configure Corosync, but I'm not going to go into all of that.
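
One detail worth a sketch is bindnetaddr. The corosync.conf man page says it may be either an address configured on the machine or the network address of the interface; using the network address lets every node carry the same file. The helper below (net_addr is a made-up name) computes the network address from an IP and a netmask. The /24 netmask in the example is an assumption, since the netmask is not given above.

```shell
#!/bin/sh
# net_addr IP NETMASK: print the network address (IP AND NETMASK).
# The body runs in a subshell so the IFS change does not leak out.
net_addr() (
    IFS=.
    set -- $1 $2          # split both dotted quads into eight fields
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
)

net_addr 192.168.254.2 255.255.255.0    # prints 192.168.254.0
```

Also note that with secauth: on every node needs the same /etc/corosync/authkey, which is generated with corosync-keygen and then copied to the other nodes.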

Pacemaker config overview

Pacemaker does not have a configuration file that you edit; it has a command line interface called crm. crm has its own command language, which can be used as one-line commands or in an interactive shell. That said, I only want to talk about the new concepts that Pacemaker brings. Pacemaker is built around having many machines in its cluster, so how do you control which application runs on which machine? This is done with a priority (score) system that you can affect through its control language.

group gr_haproxy IP_haproxy cf_haproxy \
        meta target-role="Started"
location cli-prefer-gr_haproxy gr_haproxy \
        rule $id="cli-prefer-rule-gr_haproxy" inf: #uname eq lb08
location loc_IP_bcom IP_bcom 75: lb08

This piece of the control language defines a group, which ties resources (applications) together so that they start in order and run on the same machine. The location statement is a more direct way to influence which machine an application runs on; in the example statements I am changing the priority to move the applications to the lb08 machine. There are also order statements to control the order in which applications start and what depends on what.
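
A location constraint with a cli-prefer- id, like the one above, is what the crm shell writes when a resource is moved by hand. A sketch of that workflow follows, assuming the crm shell of this era (newer versions also accept move/unmove). The run helper and the APPLY variable are made up for this example: by default the commands are only printed, and APPLY=1 runs them for real.

```shell
#!/bin/sh
# Print each command instead of running it unless APPLY=1 is set.
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

# prefer_node RESOURCE NODE: ask Pacemaker to run RESOURCE on NODE.
# crm records this as a "cli-prefer-RESOURCE" location constraint.
prefer_node() {
    run crm resource migrate "$1" "$2"
}

# release_node RESOURCE: drop that constraint so normal scores apply again.
release_node() {
    run crm resource unmigrate "$1"
}

prefer_node gr_haproxy lb08
release_node gr_haproxy
run crm configure show      # review the resulting configuration
```

Keep in mind that an inf: preference stays in place until it is removed, so the group will keep wanting to run on lb08 whenever that node is available.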

A work in progress

So far so good, as they say. I have converted our old Heartbeat machines to Corosync/Pacemaker on about 5 clusters without real issues, and so far I am impressed with its workings and speed. I am still working on understanding how to take advantage of the power of this pair, and as always I am looking for help. Any comments or hints out there?

Monday, June 7, 2010

Using Bluetooth on Linux

I love Linux in most cases, but BlueZ, the Bluetooth package on Linux, is just a mess. The first problem came from the rewrite of BlueZ to run under D-Bus and udev. It was expected that application developers would come and build GUI apps for it, but the only apps seem to be blueman and a KDE one. blueman sucks, and I just refuse to build out KDE any more, being the massive thing it is.

BlueZ (http://bluez.org) has like no documentation. Connecting a strange device is never easy, but with no documentation it just sucks. The debug flags are there but don't produce very many messages, so setting this up, again, sucks. If you Google this you will not find a whole lot of current documentation: lots of doc for the old BlueZ, but almost nothing for the new BlueZ.

I have connected GPS devices, past and current, using BT and rfcomm, and this still seems to work without issues. But just try to create a network using pand with GN (Group ad-hoc Network). OMG, a little doc would be nice. After 2 days of work I got it working, and it wasn't that hard, but if you Google stuff like this you just find people stuck on the networking end, which is easy, not the BT connecting end, which is all magic with no doc. It turns out that the new BlueZ still has the old daemons and utils in it; on Gentoo you need some magic flags and out they come. With the old daemons and utils it more or less follows the old documents plus some new ones, there are even a few examples, and up comes a network. The one thing that took me 2 hours was figuring out that you need to run something like simple-agent as root on the server machine to authorize the connection. Once I found that, up came the network, and it works. The rest is just normal Linux routing/networking, which is easy for me.
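
For orientation only, this is roughly what the pand side looks like with the old daemons, as a sketch and not the exact setup described here. The Bluetooth address is a placeholder, the run helper and APPLY variable are made up for this example, and the pand options are the ones from its man page, so check them against your BlueZ version. Commands are only printed unless APPLY=1 is set.

```shell
#!/bin/sh
# Print each command instead of running it unless APPLY=1 is set.
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

# Server side: accept PAN connections in the GN role.
# simple-agent must already be running as root here to authorize clients.
pan_server() {
    run pand --listen --role GN
}

# Client side: connect to the server's Bluetooth address.
pan_client() {
    run pand --connect "$1" --service GN
}

pan_server
pan_client 00:11:22:33:44:55    # placeholder address
```

After a successful connection a bnep0 interface shows up on both machines, and from there it is ordinary IP configuration.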

Of course, where is my document for all of this? I'm just keeping it in a safe place for now ...

Saturday, March 13, 2010

Running remote servers

How do you run a server in a remote location ?

There are many products out there to do this, but most have high prices and tend not to work well once you have more than 10 servers. The organization and wiring nightmares can make this a non-working solution very quickly. One of the biggest mistakes an organization can make is thinking that the most costly hardware will solve this problem; in most cases it just moves the blame or the responsibility somewhere other than where it should be. Organizations that go remote need to commit for real: 20-30% more money and personnel time for setting up and running a remote server farm.

There is also the problem of getting your remote equipment working and configured correctly. In a remote location you can't change the physical setup, you just work with what's there, so not having a full non-remote test setup will cause failure. When you do have physical hands coming into your remote location you want to make the best of it, which requires lots of test and configuration time beforehand so that everything is worked out before the physical work is done.

Most of these ideas are not always followed, so most remote server farms need lots of physical work and re-work to keep running well. After a while, doing remotes is not something shops want to do, because of the problems of keeping it going. In general it must be noted that all servers fail, and when they do, at best you can catch only about 90% of the failures remotely; the rest have to be diagnosed on site.

Tools for a remote server:

KVM - Most people think of KVM switches first, but they are very $$$$ and don't always work well. There are models out there that work well once you configure them right, which can take hours, and working remotely is always different. In most cases there are never enough ports to do you any good.
Remote serial port managers - These can also be $$$, though not as bad as most KVMs, but they require a motherboard/BIOS that supports serial console access, and there are a few (server class) brands that do. There are also lower-cost options that are just port switchers; these work, but you need a control machine or telnet box to control them.
BMC / SMDC / IPMI cards - On some servers this is an included option, on others it is a small expense, but for remote power-on, serial access and console/command access this may be all you really need. The hardest part is the setup, which can be done through the BIOS or through OS driver support. Linux has good BMC and IPMI support. Tyan motherboards support this well and work well with an IPMI card; I've seen this on most Dells also.
Remote power strips - In most cases this is a $$ option, and it only gives you control of the power plug and sometimes some indication that the machine is drawing power. So not a lot of help in fixing broken machines.
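
To give an idea of what the BMC/IPMI route looks like from a Linux admin box, here is a sketch using ipmitool. The host name and user are placeholders, and the run helper and APPLY variable are made up for this example; commands are only printed unless APPLY=1 is set. It assumes a BMC that speaks IPMI 2.0 (the lanplus interface); older cards may need -I lan and will not do serial-over-LAN.

```shell
#!/bin/sh
# Print each command instead of running it unless APPLY=1 is set.
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

# bmc HOST USER ARGS...: run an ipmitool command against a remote BMC.
# -E makes ipmitool read the password from the IPMI_PASSWORD environment
# variable, which keeps it off the command line.
bmc() {
    host=$1 user=$2
    shift 2
    run ipmitool -I lanplus -H "$host" -U "$user" -E "$@"
}

bmc bmc01.example.com admin chassis power status   # is the box powered on?
bmc bmc01.example.com admin chassis power cycle    # hard power cycle
bmc bmc01.example.com admin sol activate           # serial console over LAN
```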

Friday, February 5, 2010

Stuffing Linux on a Toshiba T135D

My adventures with installing Linux on a Toshiba Satellite T135D:

Specs :
VISION Premium Technology from AMD Featuring
• AMD Turion™ Neo X2 Dual-Core Mobile Processor L625
o 1.6GHz, 1MB L2 Cache, 1.6GT/s
• AMD M780G Chipset
• ATI Radeon™ HD 3200 Graphics with 256MB-1919MB dynamically allocated shared graphics memory
Memory
• Configured with 4GB DDR2 800MHz (max 8GB)
• 2 main memory slots. Both slots may be occupied.

Storage Drive
• 320GB (5400 RPM) Serial ATA hard disk drive
• TOSHIBA Hard Drive Impact Sensor (3D sensor)

Display
• 13.3” diagonal widescreen TruBrite® LED backlit TFT LCD display at 1366 x 768 native resolution (HD)
o Native support for 720p content
o 16:9 aspect ratio

Sound
Intel HD Audio

Communications
• Webcam and microphone
• 10/100 Ethernet (Atheros mod ATL1E)
• Wi-Fi® Wireless networking (802.11b/g/n) (Realtek rtl8192se)

Linux distro: Gentoo, kernel 2.6.32.7, AMD 64-bit

What works good, what works ok and what doesn't work.

What works good:

X11 with the radeonhd driver with no 3d (did not try 3d)
Suspend / resume using s1 (to bios)
Wired Ethernet with the Atheros ATL1E module worked without issues.

What works OK:

Mouse pad with the Synaptics driver: trying to get the middle button working right is not fun. After a week it still does strange things from time to time.
The Synaptics parameter AccelFactor should be set to something like 0.1; this helps a lot.

What does not work:

Wireless Realtek rtl8192se: this just is not stable yet. The latest version worked for about 20 minutes, then a kernel trap. I found the driver on the Realtek site and on this Ubuntu bug page: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/401126?comments=all . There has been a lot of activity, so maybe it will get fixed.
I tried to put a 60G OCZ SSD in this and it worked great, but suspend/resume would not always power the drive back up. I suspect a BIOS issue, but there are no new versions of the BIOS out yet that I could find.
I have the 60G OCZ SSD in again after a firmware upgrade. It seems better, but once in suspend it doesn't seem to come back. For now I am living with it.
Sound out of the speakers works, but it is soft; plugging in headphones will not mute the speakers; no mic works.
Skype works with a USB cam without issues using the uvcvideo driver, but the internal mic does not work.
With the 2.6.33.3 kernel the headphones will not mute the speakers.

Usb stuff that works :

ralink USB b/g/n with driver rt2870sta 500K + downloads, no problems
USB 5 channel Audio works great with alsa.

Misc stuff:

Runs very hot; sometimes you need something on your lap to not get burned. After a few hours the mouse seems to go crazy; I think the whole unit is overheating. This is a classic Toshiba problem, i.e. overheating. If I run it at 800 MHz it does not overheat, but it runs a lot slower.
It is fast for a laptop. Battery time is almost 4 hours, but it depends on how hard the CPU is running.

Conclusion:

Wireless sucks, sound sucks, it overheats, mouse support sucks, SSD sucks: this laptop sucks for Linux. The MSI U230 is starting to look good. They use a Realtek sound chip, Atheros wireless and a normal type of mouse, and they support SSDs, so maybe this is the way to go.
