xend refusing to start March 10, 2008

Posted by idimmu in linux.
We recently had a few power outages at work, some scheduled, some not, and this played havoc with our xen servers.

One of the problems we had was that xend would not start (and thus xendomains would also not start).

Checking /var/log/xen/xend.log gave us the following snippet:


inst = XendNode()
File "/usr/lib/python2.5/site-packages/xen/xend/XendNode.py", line 164, in __init__
saved_pifs = self.state_store.load_state('pif')
File "/usr/lib/python2.5/site-packages/xen/xend/XendStateStore.py", line 104, in load_state
dom = minidom.parse(xml_path)
File "xml/dom/minidom.py", line 1913, in parse
File "xml/dom/expatbuilder.py", line 924, in parse
File "xml/dom/expatbuilder.py", line 211, in parseFile
ExpatError: no element found: line 1, column 0
[2008-03-10 21:37:40 18122] INFO (__init__:1094) Xend exited with status 1.


A quick google of that error turned up several people who had come across the same problem, but no actual answer!

It looked like xend was having problems parsing an XML file. Some quick mental inspiration, and the find command, yielded /var/lib/xend/state/pif.xml, which was a 0 byte file! A comparison with a working server showed that it should (or at least could) contain this:


<?xml version="1.0" ?>
<pifs/>
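
For anyone who would rather script the check than hunt with find, here is a rough Python sketch of what we did by hand. It assumes the state files live in one directory as .xml files (we only ever looked at /var/lib/xend/state/pif.xml), and the function names are made up for this post. The only file content it knows how to restore is the minimal pif.xml shown above; anything else it just reports.

```python
import os
import shutil
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Minimal document copied from the working server (see above).
EMPTY_PIFS_XML = '<?xml version="1.0" ?>\n<pifs/>\n'


def find_unparseable_state_files(state_dir):
    """Return the .xml files in state_dir that minidom cannot parse."""
    broken = []
    for name in sorted(os.listdir(state_dir)):
        if not name.endswith(".xml"):
            continue
        path = os.path.join(state_dir, name)
        try:
            # Same call that fails in XendStateStore.load_state.
            minidom.parse(path)
        except ExpatError:
            broken.append(path)
    return broken


def restore_empty_pif(pif_path):
    """Write the minimal <pifs/> document, but only if the file is 0 bytes."""
    if os.path.getsize(pif_path) != 0:
        return False  # leave non-empty files alone
    shutil.copy2(pif_path, pif_path + ".bak")  # keep a copy of the original
    with open(pif_path, "w") as handle:
        handle.write(EMPTY_PIFS_XML)
    return True
```

Something like `find_unparseable_state_files("/var/lib/xend/state")` would then list the files xend is going to choke on, and `restore_empty_pif` only touches a file that really is empty, so a pif.xml with real content is never overwritten.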


A copy and paste later and we had a working xend! However, it refused to create any of the xenlets:


root@xen0:/etc/xen# xm create server0.cfg
Using config file "./server0.cfg".
Error: The privileged domain did not balloon!


Despite there being plenty of RAM!


root@xen0:/var/log/xen# xm list
Name ID Mem VCPUs State Time(s)
Domain-0 0 7928 8 r----- 832.8
root@xen0:/var/log/xen# free
total used free shared buffers cached
Mem: 8119416 393028 7726388 0 11344 58832
-/+ buffers/cache: 322852 7796564
Swap: 15631224 0 15631224


An strace of the process revealed that xen thought it had less memory available than it actually had:


[2008-03-10 21:47:48 18620] DEBUG (__init__:1094) Balloon: 131064 KiB free; 0 to scrub; need 524288; retries: 20.
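
To put a number on how far off xen was, here is a small Python sketch that pulls the figures out of a Balloon debug line like the one above. The regex is written against that single log line, so treat the format as an assumption, and it also assumes the "need" figure is in KiB like the "free" one (the log does not say). The function name is invented for this post.

```python
import re

# Pattern based only on the one Balloon line quoted in this post.
# \s+ between fields so a line wrapped by the log viewer still matches.
BALLOON_RE = re.compile(
    r"Balloon:\s+(\d+)\s+KiB free;\s+(\d+)\s+to scrub;\s+need\s+(\d+);"
)


def balloon_shortfall_kib(line):
    """Return need minus free (assumed KiB), or None if the line does not match."""
    match = BALLOON_RE.search(line)
    if match is None:
        return None
    free_kib = int(match.group(1))
    need_kib = int(match.group(3))
    return need_kib - free_kib
```

For the line above this gives 524288 - 131064 = 393224, so xen believed it was roughly 384 MiB short of the 512 MiB the guest needed, even though free reported over 7 GB unused in dom0.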


As we finally had a working xend, we decided to apply a technique we'd learned from working with Windows machines and rebooted the server. This magically fixed the memory issue. It would have been nice to know what actually caused it, and whether there was a proper fix, though.
