yum error: Couldn’t fork Cannot allocate memory

I’ve been doing some awesome things to a new VM for work, namely installing CouchDB, Apache and running Node.js apps alongside a WordPress plugin using AngularJS. It’s pretty cool. But computers are dicks, so when it came down to installing Monit to ensure everything was lovely I got the following error. Bum.

error: Couldn’t fork %pre(monit-5.5-1.el6.rf.x86_64): Cannot allocate memory

Seems simple enough: for whatever reason Yum cannot allocate memory, so let’s take a peek


[root@bridge opt]# free
             total       used       free     shared    buffers     cached
Mem:       1020376     832736     187640          0       3988      81256
-/+ buffers/cache:      747492     272884
Swap:            0          0          0

Man, there’s totally enough memory there: 187MB of RAM is free. Quake took less than that and is way more complicated than some stupid RPMs.. maybe it’s something else!

Quite often this error is caused by the RPM database containing duplicates or having been corrupted in some way, so let’s try to clean that up.


[root@bridge ~]# package-cleanup --cleandupes
Loaded plugins: fastestmirror, protectbase
Loading mirror speeds from cached hostfile
* base: mirror.checkdomain.de
* epel: mirrors.n-ix.net
* extras: mirror.checkdomain.de
* rpmforge: mirror1.hs-esslingen.de
* updates: mirror.checkdomain.de
1490 packages excluded due to repository protections
No duplicates to remove
[root@bridge ~]# rpm --rebuilddb

Well, no duplicates and the RPM database is all cool, so let’s try again ..


[root@bridge ~]# yum install monit
Loaded plugins: fastestmirror, protectbase

Running Transaction
Error in PREIN scriptlet in rpm package monit-5.5-1.el6.rf.x86_64
error: Couldn't fork %pre(monit-5.5-1.el6.rf.x86_64): Cannot allocate memory
error: install: %pre scriptlet failed (2), skipping monit-5.5-1.el6.rf
Verifying : monit-5.5-1.el6.rf.x86_64 1/1

Failed:
monit.x86_64 0:5.5-1.el6.rf

Complete!

Man, haters gonna hate!

Solving error: Couldn’t fork %pre(monit-5.5-1.el6.rf.x86_64): Cannot allocate memory

Ok, let’s step back a minute and assume the error is legit, and turn some stuff off ..


[root@bridge ~]# /etc/init.d/couchdb stop
Stopping database server couchdb
[root@bridge ~]# /etc/init.d/httpd stop
Stopping httpd: [ OK ]

And try again!


[root@bridge ~]# yum install monit
Loaded plugins: fastestmirror, protectbase

Downloading Packages:
monit-5.5-1.el6.rf.x86_64.rpm | 267 kB 00:00
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
Installing : monit-5.5-1.el6.rf.x86_64 1/1
Verifying : monit-5.5-1.el6.rf.x86_64 1/1

Installed:
monit.x86_64 0:5.5-1.el6.rf

Complete!

Sweet, that did it. So it was a bona fide legit error, and shutting some services down freed up enough memory to let us install RPMs again.


[root@bridge ~]# free
             total       used       free     shared    buffers     cached
Mem:       1020376     510972     509404          0      11632     146780
-/+ buffers/cache:      352560     667816
Swap:            0          0          0

mmm, 509MB free, that’s a lot more.. It’s probably not that Yum itself needs a ton of RAM; the Cannot allocate memory most likely comes from fork() having to reserve room for a copy of the parent process, and with no swap configured the kernel can refuse that. If you guys get this problem, try turning some services off and on again 😉
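If turning services off isn’t an option, another workaround (a sketch, assuming you’ve got a spare GB of disk and no swap configured, as above) is to give the kernel some swap so the fork() has headroom:


# add a 1GB swap file; size and path are just examples
dd if=/dev/zero of=/swapfile bs=1M count=1024
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile
free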

Speeding Up MDADM RAID Rebuilds

I’m slowly migrating a bunch of awesome things from a really old server (it’s still running Ubuntu 10.04..) to a really nice and shiny new one, which has 2 new 3TB HDDs in RAID 1, and they’re syncing..


cat /proc/mdstat
md3 : active raid1 sda4[0] sdb4[1]
1847478528 blocks super 1.2 [2/2] [UU]
[>....................] resync = 0.1% (2061952/1847478528) finish=28857.9min speed=1065K/sec

That’s like 20 days of syncing.. surely we can do better than that? Can’t we speed up this MDADM RAID 1 rebuild?

Speeding Up MDADM RAID Rebuilds

Sure we can, we’re awesome, I’m awesome, you’re awesome, can I get a hell yeah? HELL YEAH!

Like lots of things that are tunable, this means doing NASTY things to /proc. This bit sets the minimum speed we want mdadm to go at.


cat /proc/sys/dev/raid/speed_limit_min
1000

We’re greedy though, and in the words of Tim ‘The Tool Man’ Taylor: MORE POWER. As root..


echo 50000 > /proc/sys/dev/raid/speed_limit_min

We’ve just raised mdadm’s minimum rebuild speed FIFTY TIMES. FOR FREE. We didn’t even swap out the SATA disks for SAS SSDs, we just changed a number ..


md3 : active raid1 sda4[0] sdb4[1]
1847478528 blocks super 1.2 [2/2] [UU]
[=====>...............] resync = 26.0% (480866560/1847478528) finish=141.9min speed=160501K/sec

Now 10 minutes later we’re already over 1/4 the way through, 2 hours left baby. Hell yeah.
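One caveat: the echo into /proc only lasts until the next reboot. The same knobs are exposed as sysctls, so something like this (a sketch, values are just the ones used above) makes it stick:


# equivalent to the echo above, but via sysctl
sysctl -w dev.raid.speed_limit_min=50000
# there's also a ceiling you can raise if your disks can take it
sysctl -w dev.raid.speed_limit_max=200000
# persist across reboots
echo "dev.raid.speed_limit_min = 50000" >> /etc/sysctl.conf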

authorized_keys vs authorized_keys2

Earlier today I was setting up a brand new server for a migration, and just as I was typing scp .ssh/authorized_keys2 my brain went and asked a question..

What is the difference between authorized_keys and authorized_keys2?

I’ve been working with Linux for well over a decade and some of my practices stem from things I learned in the ’90s that still work, putting all my public keys in ~/.ssh/authorized_keys2 is one of those things.

authorized_keys vs authorized_keys2

In OpenSSH releases earlier than 3, the sshd man page said:

The $HOME/.ssh/authorized_keys file lists the RSA keys that are permitted for RSA authentication in SSH protocols 1.3 and 1.5. Similarly, the $HOME/.ssh/authorized_keys2 file lists the DSA and RSA keys that are permitted for public key authentication (PubkeyAuthentication) in SSH protocol 2.0.

Which is pretty self-explanatory, so that was the key difference between the files originally: authorized_keys for RSA keys in SSH protocols 1.3 and 1.5, and authorized_keys2 for protocol 2.0.

What is the difference between authorized_keys and authorized_keys2?

However, that’s from releases of OpenSSH earlier than 3.0, which came out in 2001, a long time ago.. and looking back at the OpenSSH 3.0 release announcement, authorized_keys2 is actually deprecated. We should all just have been using authorized_keys from then (er, 2001..) onwards!
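For what it’s worth, current OpenSSH still reads both files by default via the AuthorizedKeysFile directive, which is why the old habit keeps working. If you want to be tidy about it you can spell it out in /etc/ssh/sshd_config yourself (a sketch, the commented line shows narrowing it once everything has migrated):


# the stock default is roughly this, which is why authorized_keys2 still works
AuthorizedKeysFile .ssh/authorized_keys .ssh/authorized_keys2
# once everything lives in authorized_keys you could narrow it to:
# AuthorizedKeysFile .ssh/authorized_keys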

sudo: sorry, you must have a tty to run sudo

We’re using an old version of Upstart, on CentOS, to manage stopping and starting our Node.js daemons, and one of the things the script does, like any good daemon, is change the user of the daemon process from root to something more applicable, security and all that 😉

The scripts look a little like this


#!upstart
description "Amazing Node.js Daemon"
author "idimmu"

start on runlevel [2345]
stop on shutdown

env PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
env NAME="amazing-daemon"

script
export HOME="/root"
cd /opt/idimmu/$NAME
echo $$ > /var/run/$NAME.pid
exec sudo -u idimmu /usr/bin/node /opt/idimmu/$NAME/server.js >> /var/log/$NAME/stdout.log 2>&1
end script

pre-start script
echo "[`date -u +%Y-%m-%dT%T.%3NZ`] (upstart) Starting $NAME" >> /var/log/$NAME/stdout.log
end script

pre-stop script
rm /var/run/$NAME.pid
echo "[`date -u +%Y-%m-%dT%T.%3NZ`] (upstart) Stopping $NAME" >> /var/log/$NAME/stdout.log
end script

Which is nice, as it means we can use Upstart to stop/start/status daemons really nicely. The equivalent init.d script looked really horrible.

But there’s one massive caveat, which we always encounter when building a brand new box, from scratch.


[2013-09-27T10:50:10.174Z] (upstart) Starting amazing-daemon
sudo: sorry, you must have a tty to run sudo

sudo: sorry, you must have a tty to run sudo

So it all falls apart due to the following error:

sudo: sorry, you must have a tty to run sudo

Basically sudo is refusing to run the command because Upstart doesn’t have a TTY and sudo’s requiretty option demands one. This is easily fixable. Just edit /etc/sudoers using visudo and comment out


Defaults requiretty

i.e.


#Defaults requiretty
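If you’d rather not switch requiretty off for everyone, a narrower override also works (a sketch, assuming the sudo is being run by root, which it is here since Upstart runs the script as root):


# leave requiretty on globally, but exempt the user running the Upstart script
Defaults:root !requiretty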

Now we can use Upstart to start the daemon and check its status to confirm it’s running! More recent versions of Upstart don’t need this hack. One day I’ll live in the future, but not today.


deploy:amazing root$ start amazing
amazing start/running, process 3965
deploy:amazing root$ status amazing
amazing start/running, process 3965

Bamo, problem solved!

pv – Pipe Viewer – My New Favourite Command Line Tool

I’ve got a rather large dataset that I need to do a lot of processing on, over several iterations. It’s a 20GB gzip file of flat text, and I’m impatient and don’t like not knowing things!

My new favourite Linux command line tool, pv (pipe viewer) is totally awesome. Check this out:


pv -cN source < urls.gz | zcat | pv -cN zcat | perl -lne '($a,$b,$c,$d) = split /\||\t/; print $b unless $b =~ /ac\.uk/; print $c unless $c =~ /ac\.uk/' | pv -cN perl | gzip | pv -cN gzip > hosts.gz
zcat: 93.4GiB 1:33:18 [26.6MiB/s] [ <=> ]
perl: 85.7GiB 1:33:18 [25.3MiB/s] [ <=> ]
source: 13.2GiB 1:33:17 [3.57MiB/s] [===============================================> ] 67% ETA 0:44:41
gzip: 12.7GiB 1:33:18 [3.51MiB/s] [ <=> ]

I’m basically splitting some text, removing stuff I don’t want and doing:


zcat urls.gz | perl -lne '($a,$b,$c,$d) = split /\||\t/; print $b unless $b =~ /ac\.uk/; print $c unless $c =~ /ac\.uk/' | gzip > hosts.gz

But at appropriate moments I’ve piped the output into the pv pipe viewer tool to report on some metrics. FYI, the -N flag lets me set a name for each pv instance, and the -c flag enables cursor positioning so we can use multiple instances of pv at once!
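If you just want the simplest possible taste of it, a single pv reading a file it knows the size of gives you the progress bar and ETA on its own (a sketch, reusing the same urls.gz):


# one pv, no names or cursor tricks needed
pv urls.gz > /dev/null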

The reason pipe viewer is totally cool is the extra sneaky data we get!

Pipe Viewer Is Magic

Because the first instance of pv is reading our urls.gz file itself, it can display how much of the file it’s processed and roughly how long it will take to finish. MOST USEFUL THING EVER! Also, I had no idea how large the dataset was uncompressed and was hesitant to extract it blind; we can see from the pv instance named zcat that zcat has so far spat out 93.4GB of data, and at 67% through we can predict the file is probably around 140GB if we extract it. How cool is that? We can also tell from the pv named perl that after splitting and removing the data we don’t want, we’ve so far shaved off nearly 8GB, which is kinda interesting to splurge over for a bit. Lastly, the pv instance named gzip is telling us the size of the output file we’ve generated so far.

This is totally rad.

Note: many thanks to Norway for forcing me to rewrite my initial one-liner of


zcat urls.gz | sed 's/|/ /g' | while read a b c d ; do echo $b ; echo $c ; done | grep -v ac.uk$ | gzip > hosts.gz

by glaring at me.

Enable Linux Core Dump

One of our applications (Freeswitch) just randomly crashed for no apparent reason and didn’t write anything to its log files. The service we’re trialling is currently in beta so there’s room to muck about and do some diagnostics. I want the kernel to dump a core file whenever Freeswitch dies, in case it happens again, so that we have something to work with after the fact. It’ll also shut up my QA manager.

Check The Current Linux Core Dump Limits

ulimit is used to specify the maximum size of generated core dumps; this stops a crashing app dumping a million GB of process memory to disk and blowing your disk up. By default it’s 0, which means nothing gets written to disk and no dump is created!


hstaging:~ # ulimit -c
0

Change The Linux Core Dump Limits To Something Awesome

To set the size limit of Linux core files to 75000 blocks (ulimit -c counts in 1024-byte blocks, not bytes), you can do something like this


hstaging:~ # ulimit -c 75000
hstaging:~ # ulimit -c
75000

but I’m a maverick, this does exactly what you think it does


hstaging:~ # ulimit -c unlimited
hstaging:~ # ulimit -c
unlimited

Enable Linux Core Dump For Application Crashes And Segfaults And Things

Ok, so we want this to persist across reboots, which basically means we have to stick the ulimit command in /etc/profile. I’m putting this at the bottom of mine:


#corefile stuff
ulimit -c unlimited > /dev/null 2>&1

The redirect will stop anything weird getting spat out to the screen, and the comment nicely tells us that it’s core file stuff.

For our next trick we’ll set some sysctl flags, so in /etc/sysctl.conf add


#corefile stuff
kernel.core_uses_pid = 1
kernel.core_pattern = /tmp/core-%e-%s-%u-%g-%p-%t
fs.suid_dumpable = 2

This basically says: when an application crashes, create a core dump file in /tmp with a useful name pattern.


kernel.core_uses_pid = 1 - add the pid of the crashed app to the filename.
fs.suid_dumpable = 2 - enable linux core dumps for setuid processes.
kernel.core_pattern = /tmp/core-%e-%s-%u-%g-%p-%t - crazy naming pattern for a successful core dump, here's roughly what all the bits mean:
%e - executable filename
%s - number of signal causing dump
%u - real UID of dumped process
%g - real GID of dumped process
%p - PID of dumped process
%t - time of dump (seconds since 0:00h, 1 Jan 1970)

Super useful. Then run sysctl -p so it takes effect, yo!


hstaging:~ # sysctl -p

kernel.core_uses_pid = 1
kernel.core_pattern = /tmp/core-%e-%s-%u-%g-%p-%t
fs.suid_dumpable = 2

Enabling Linux Core Dump For All Apps

Now here’s the last part. When you want an application to core dump, you set an environment variable before you start it, telling the kernel to sort itself out and get ready to dump. If you want all apps on the server to generate core dumps, you’ll want to set this variable somewhere near the top of the process chain. The best place for this on a Red Hat-style box is /etc/sysconfig/init, so stick the following in that file


DAEMON_COREFILE_LIMIT='unlimited'

Now might be a good idea to reboot, to force it to be set across all applications and things.

Enabling Linux Core Dumps For A Specific Application

This is the slightly less rebooty version of the above. Rather than force the environment variable to be loaded when the box starts, we just stick it in the init script for the daemon and then restart the daemon.

In /etc/init.d/functions the RedHat guys have already stuck in


corelimit="ulimit -S -c ${DAEMON_COREFILE_LIMIT:-0}"

So we need to make sure we put our DAEMON_COREFILE_LIMIT above that. Simples. In our case it’s in /etc/init.d/freeswitch with


DAEMON_COREFILE_LIMIT='unlimited'

Distros That Aren’t RedHat

DAEMON_COREFILE_LIMIT is a RedHatism. If you’re running something cool, like Ubuntu, you’ll want to use


ulimit -c unlimited >/dev/null 2>&1
echo /tmp/core-%e-%s-%u-%g-%p-%t > /proc/sys/kernel/core_pattern

instead.

Testing Core Dumps

This is EASY: we just start the daemon, send it a segfault signal, and look in the right place!!


hstaging:tmp # /etc/init.d/freeswitch start
hstaging:tmp # /etc/init.d/freeswitch status
freeswitch (pid 8257) is running...
hstaging:tmp # kill -s SIGSEGV 8257
hstaging:tmp # ls /tmp/core*
core-freeswitch-11-493-492-8257-1371823178

Now you give this file to your developers and take a bow!
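If you want to sanity check the setup without poking a real daemon, any process will do (a sketch, sleep is as good a victim as any):


# in a fresh root shell, so the ulimit applies to the child
ulimit -c unlimited
sleep 60 &
kill -s SIGSEGV $!
ls /tmp/core-sleep-*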

CouchDB {"error":"insecure_rewrite_rule","reason":"too many ../.. segments"}


Whilst working on an AMAZING NPM repository mirror yesterday (which totally works, despite not really offering the performance benefit I’d hoped for, because NPM is rubbish) I came across this error whilst doing things


16 http GET http://localhost:5984/registry/_design/app/_rewrite/-/all/since?stale=update_after&startkey=1371737164294
17 http 500 http://localhost:5984/registry/_design/app/_rewrite/-/all/since?stale=update_after&startkey=1371737164294
18 error Error: insecure_rewrite_rule too many ../.. segments: registry/_design/app/_rewrite/-/all/since
18 error at RegClient.<anonymous> (/root/.nvm/v0.8.15/lib/node_modules/npm/node_modules/npm-registry-client/lib/request.js:259:14)
18 error at Request.init.self.callback (/root/.nvm/v0.8.15/lib/node_modules/npm/node_modules/request/main.js:120:22)
18 error at Request.EventEmitter.emit (events.js:99:17)
18 error at Request.<anonymous> (/root/.nvm/v0.8.15/lib/node_modules/npm/node_modules/request/main.js:648:16)
18 error at Request.EventEmitter.emit (events.js:126:20)
18 error at IncomingMessage.Request.start.self.req.self.httpModule.request.buffer (/root/.nvm/v0.8.15/lib/node_modules/npm/node_modules/request/main.js:610:14)
18 error at IncomingMessage.EventEmitter.emit (events.js:126:20)
18 error at IncomingMessage._emitEnd (http.js:366:10)
18 error at HTTPParser.parserOnMessageComplete [as onMessageComplete] (http.js:149:23)
18 error at Socket.socketOnData [as ondata] (http.js:1367:20)
19 error If you need help, you may report this log at:
19 error
19 error or email it to:
19 error

Visiting that URL in a web browser gave me


{"error":"insecure_rewrite_rule","reason":"too many ../.. segments"}

This is because secure rewrites are enabled! Looking at my CouchDB config, this is set in default.ini:


secure_rewrites = true

so in the [httpd] section of the local.ini file I set it to false, in your face security model!


secure_rewrites = false

Then I restarted CouchDB, and the world was put to rights and the error went away.
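For reference, the whole stanza I ended up with in local.ini looks something like this (the [httpd] section name is where secure_rewrites lives in CouchDB 1.x’s default.ini, so treat that as an assumption on anything newer):


[httpd]
secure_rewrites = false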

Think Carefully About Your Clever Project Names

We’re building an exciting new cluster at work using Linux HA and stuff to make it work magically.

In the olden days of yore RedHat and co were using Pacemaker with the old crm (Cluster Resource Manager / Cluster Relationship Manager, pick one..) tool for cluster management, which was nice. Now it looks like RedHat have removed the crm command from their repositories and have switched to PCS which stands for either Pacemaker/Corosync Configuration System or is the plural of PC (Personal Computers).

This makes searching for anything useful on the Internet quite difficult.

PCS Clustering

It would have been better to go with a random pronounceable non-word than the stupid PCS acronym; it would probably even have got them some delicious Web 2.0 VC!

Installing Magento

One of my clients wanted an e-commerce solution for his website, and after a little bit of analysis we opted for the community edition of Magento.

We wanted something based on the usual LAMP stack that was open source so it could be extended, was free as in beer, had great international support, was fully featured and enterprise ready, could pass PCI DSS compliance, offered reasonable payment gateway options, could scale, and was easy to extend as well as back up. We also wanted complete control of the deployment, rather than integrating with a 3rd party cloud service provider such as Shopify, to keep costs down and retain flexibility.

Installing Magento

Without going into the specifics of configuring an Apache VirtualHost or installing MySQL, here’s a rough guide on how to install Magento.

Download Magento from their download page. Always opt for the latest version as it includes important security fixes. You will need to create an account on their site for this.

Create a new MySQL database for the installation and note its credentials for later use.
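Something like the following does the trick, assuming a local MySQL server; the magento database/user names and the changeme password are placeholders, pick your own:


# MySQL 5.x style; you'll be prompted for the root password
mysql -u root -p <<'SQL'
CREATE DATABASE magento;
GRANT ALL PRIVILEGES ON magento.* TO 'magento'@'localhost' IDENTIFIED BY 'changeme';
FLUSH PRIVILEGES;
SQL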

Extract the Magento archive to your document root.

You will need to set write permissions for the web server to write to the following files and directories.

  • var
  • var/.htaccess
  • app/etc

You can either chmod 777 or get a little cleverer about ownership. e.g.


sudo chmod -R 777 var var/.htaccess app/etc
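The cleverer ownership route looks something like this (a sketch, assuming Apache runs as the apache user, which is the CentOS default; Debian and Ubuntu use www-data):


# hand the writable bits to the web server user instead of opening them to the world
sudo chown -R apache:apache var app/etc
sudo chmod -R 775 var app/etc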

Assuming Apache is correctly configured, you can now point your web browser at the install directory under the URL you’ve installed Magento at and follow the online install guide, e.g. if the site is http://www.shopidimmu.net/, the wizard will be located at http://www.shopidimmu.net/install/.

Choosing A Credit Card Processor

After we’d correctly installed and configured Magento, I left it to the web team to get on with populating it with products and making it look pretty, but we still had to find the best credit card processor to accept payments, as we didn’t want to use PayPal. We also wanted someone who offered physical solutions to accept card payments with chip and PIN.

After checking out some reviews for Merchant Warehouse and Charge.com we decided to opt for Merchant Warehouse, as they offer really easy integration with Magento via their MerchantWare plugin.

Installing the Magento Connect MerchantWare plugin was trivial and just required copying and pasting our extension key into the Magento Connect Manager and clicking install.

monit: error connecting to the monit daemon

We’re rolling out monit on our new platform at the request of a vendor, to manage their new service. I’ve always been dead against these kinds of automated failure recovery tools: they often require human intervention after the fact anyway, and all the platforms I’ve managed would have failed the server out anyway, so why not restart the services after the root cause analysis is done? My tune is slowly changing though, and I’m coming to appreciate this method of systems recovery a lot more.

Whilst playing with it though I got the following error


root@newshiny:~# monit summary
monit: error connecting to the monit daemon

What what? The daemon’s definitely running, why can’t I poll its status?


root@newshiny:~# ps aux | grep monit
root 325293 0.0 0.0 16440 1276 ? Sl Mar27 0:10 /usr/bin/monit
root 496627 0.0 0.0 105348 832 pts/0 S+ 11:43 0:00 grep monit
root@newshiny:~# service monit status
monit (pid 325293) is running...

After actually reading the documentation, this monit: error connecting to the monit daemon turned out to be an epic case of rushing into things, skimming the docs and PEBCAK!

Solving monit: error connecting to the monit daemon

Monit can present an HTTP interface, which I hadn’t enabled as I thought it was just for humans; it turns out the command line tools use it too!

It’s really easy to enable. In /etc/monit.conf, or wherever your conf file is located, just add


set httpd port 2812 and
use address localhost
allow localhost
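If leaving even a localhost-only HTTP port wide open makes you itchy, the same block can also take basic auth (the credentials here are placeholders; the monit CLI reads this same file, so it should pick them up itself):


set httpd port 2812 and
    use address localhost
    allow localhost
    allow admin:changeme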

and restart monit with


service monit restart

and Bob’s your mother’s brother.


root@newshiny:~# netstat -lpn | grep 2812
tcp 0 0 127.0.0.1:2812 0.0.0.0:* LISTEN 325293/monit


root@newshiny:~# monit summary
The Monit daemon 5.2.5 uptime: 19h 18m

Process 'shiny_manager' running
Process 'shiny_proxy' running
Process 'shiny_server' running
System 'system_newshiny' running

Problem solved!