Mange / roadie

Making HTML emails comfortable for the Ruby rockstars

Server aborting

ogtfaber opened this issue

We're getting this error occasionally:

ruby(5772,0x102800000) malloc: *** error for object 0x8: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
zsh: abort      bundle exec rails s

It happens when executing this line:
compiled_html = Roadie.inline_css(Roadie.current_provider, %w(reports), html, {})

If we remove that line, the crashes don't occur. It can happen after making the same request a couple of times.

Seems like a problem with Nokogiri on your server. Check to see if it's the latest version installed and check if there's a newer version of libxml or libxslt available.

I'm no C developer, but I think that error could be caused by memory not being properly allocated; perhaps your server process is running out of memory?

I had this same issue and it appears to be a problem with certain versions of libxml and nokogiri. Following the instructions here fixed it for me.

Thanks for the heads-up @AaronH. I'm closing this for now while I wait for @ogtfaber to get back in touch.

Solved it somehow.. I had the same problem on my machine and tried every solution that was mentioned, but no luck.

Setup: OSX Lion 10.7.3, homebrew 0.9, RVM (latest), ruby 1.9.3-p0, rails 3.2.2, nokogiri 1.5.2, roadie 2.3.1

Errors: Segfaults, 'malloc: *** error for object 0x7fc187660008: pointer being freed was not allocated'
Warning: 'WARNING: Nokogiri was built against LibXML version 2.7.8, but has dynamically loaded 2.7.3'
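The warning above says that the libxml2 Nokogiri was compiled against is not the one loaded at runtime. A minimal sketch of a check for that kind of mismatch follows. `libxml_mismatch?` is a hypothetical helper, not part of Nokogiri, and the commented-out lookup assumes your Nokogiri version exposes `compiled` and `loaded` entries in `Nokogiri::VERSION_INFO`.

```ruby
require 'rubygems' # provides Gem::Version; loaded by default on Ruby 1.9+

# Hypothetical helper (not part of Nokogiri): true when the libxml2 version
# Nokogiri was compiled against differs from the one loaded at runtime.
def libxml_mismatch?(compiled, loaded)
  Gem::Version.new(compiled) != Gem::Version.new(loaded)
end

# Versions taken from the warning above.
puts "libxml2 mismatch: #{libxml_mismatch?('2.7.8', '2.7.3')}"

# With Nokogiri loaded, the values could be read like this (assumption: your
# Nokogiri version provides these VERSION_INFO keys):
#   info = Nokogiri::VERSION_INFO['libxml']
#   libxml_mismatch?(info['compiled'], info['loaded'])
```

If the check reports a mismatch, that points at the build and linking of Nokogiri rather than at application code.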

For some reason, the warning and errors went away when I explicitly required nokogiri in my application.rb. I don't understand why though, but it might help others.

Update: Interestingly, it seems that the require only helps when I add it before the 'Bundler.require' call in application.rb

Update2: Narrowing it down, it turned out that when I removed the explicit require and added roadie in my Gemfile before the 'pg' (postgres) gem (v0.11), the error disappeared. So, there seems to be something in the pg gem that must have caused the issues I was experiencing.

Ok guys, sorry for the absence, I've reinstalled nokogiri as @AaronH suggested, and it works. The instructions from @moiristo don't work for me...

Thanks for all the suggestions!

A colleague of mine had the same problem and reinstalling as @AaronH suggested worked for him as well. I don't know what's different in my setup that causes the pg gem to conflict with this gem, but I thought it was worth mentioning anyway for others to try.

We were encountering the same issue and the gem ordering @moiristo suggested worked for us. We had also tried the solution @AaronH provided first, so maybe both are necessary when working with PostgreSQL.

I hope you are aware that the order in which gems appear in the Gemfile does not directly determine the order in which they are loaded. The load order should be considered random, and if reordering worked for anyone, the problem might come back later.

Why should it be considered random? In ruby 1.9 it shouldn't be a problem if a hash is built in the loading process.

Bundler does not guarantee any specific order. In practice, it is deterministic as long as the version of Bundler doesn't change and you're using the same Gemfile.lock, but you are supposed to consider it random.

The reason for this is that gems have dependencies and Bundler will have to jump around in the file and include a "later" gem as a dependency.

Look at Gemfile.lock. The gems are in alphabetical order in there. The order in the Gemfile will not really matter.

I can get this to happen in this situation:

gem 'roadie'

and using Devise.... when I try to do a password reset the server crashes. I doubt this is useful. I just removed the gem for now.

I doubt this is useful.

You are correct. If you want me to take a look at it you need to tell me more information.

Interesting. I never had this problem until I installed roadie... I take roadie out and the problem goes away. Whatever is causing it, it's coming from roadie...

So, for now, I can't use roadie...

Was Nokogiri part of your gem bundle before, or was it added by roadie?

It's always been there. Adding roadie is what started the problems. I use pg as well...no interaction issues at all until I added roadie. Ordering of gems doesn't seem to have any effect.

Lee

Lee Atchison

Okay, so either Roadie is doing something naughty with Nokogiri that causes it to crash (threads, hidden APIs, wrong API calls), or Nokogiri crashes because Roadie exercises more of it than your app did before (for example, if you previously only parsed XML, or only generated XML without parsing).

Does it appear to crash randomly or is it crashing on the same stuff over and over?

It crashes "pretty" reliably. It won't sometimes, but almost always. It crashes during email sending when it does crash.

Lee

So it always crashes if you send a particular email, say, 20 times? Perhaps you could strip things out until I have some sort of test case?

The email doesn't matter. It could be completely blank and it will still fail. It's also not one failure in 20, but more like one success in 20.

Lee

In that case I'd wager that there's something wrong with Nokogiri. You should try to use Nokogiri from the console. Get a new HTML document and start searching for different CSS selectors.
You could also fork the gem and modify it to print debug messages all the time and see where it crashes. I think it'll be in different places every time.

Sorry, but I don't buy it. I've been using Nokogiri in various places for years…never seen this problem. I've used many combinations of gems and modules in various versions of rails and have never seen an error like this. I add roadie, and all hell breaks loose. It's something that roadie is doing.

I wanted to try using roadie, but it's causing too many problems, so I pulled it…not using it anymore. I'm sorry, but I just can't take the chance…it's not stable enough…

Lee


Sure, but roadie is doing a lot when it's working on an email. It needs to parse the tree and then scan for all elements matching all your CSS selectors and then change them and then render it all back to HTML again.

I'm not sure how I could use Nokogiri in a way that crashes the C runtime. Either Nokogiri is not guarding against something, or the error is in C land. As far as I know, I'm using the normal public API and some of its most common methods, just at a volume high enough to exercise C land and bring any issues to light, if present.

I'm open to the suggestion that I'm doing something wrong, but I need someone to come up with a suggestion on what to look for. I'm sorry I couldn't help you more.

Still keeping this as a WORKSFORME.

Having this exact same problem; removing Roadie makes everything work just fine. I also think it's a cop-out to just say "works for me" and close it, so I'm going to fork it and try to solve the issue without taking the lazy way out. @leeatchison I'll keep you posted.

Thank you!

Lee Atchison

Same here. Added roadie to the Gemfile and it starts crashing.
Capybara already had a nokogiri dependency before…

Uninstalled all nokogiri versions, installed again. no change, still crashing.

Exception Type:  EXC_BAD_ACCESS (SIGABRT)
Exception Codes: KERN_INVALID_ADDRESS at 0x00000000000074ae

VM Regions Near 0x74ae:
--> 
    __TEXT                 00000001029b5000-0000000102b92000 [ 1908K] r-x/rwx SM=COW  /Users/USER/*

Application Specific Information:
objc[80657]: garbage collection is OFF
abort() called

Thread 0 Crashed:
0   libsystem_kernel.dylib          0x00007fff8c75fce2 __pthread_kill + 10
1   libsystem_c.dylib               0x00007fff8ba587d2 pthread_kill + 95
2   libsystem_c.dylib               0x00007fff8ba49a7a abort + 143
3   ruby                            0x00000001029e9ac9 rb_bug + 185
4   ruby                            0x0000000102a8dfaf sigsegv + 79
5   libsystem_c.dylib               0x00007fff8baaacfa _sigtramp + 26
6   libxml2.2.dylib                 0x00007fff8bc257e5 xmlSetListDoc + 18
7   libxml2.2.dylib                 0x00007fff8bc25859 xmlSetTreeDoc + 85
8   libxml2.2.dylib                 0x00007fff8bc257f6 xmlSetListDoc + 35
9   libxml2.2.dylib                 0x00007fff8bc25859 xmlSetTreeDoc + 85
10  libxml2.2.dylib                 0x00007fff8bc257f6 xmlSetListDoc + 35
11  libxml2.2.dylib                 0x00007fff8bc25859 xmlSetTreeDoc + 85
12  libxml2.2.dylib                 0x00007fff8bc2d239 xmlAddChild + 181
13  nokogiri.bundle                 0x00000001031823d8 dealloc_node_i + 56 (xml_document.c:14)
14  ruby                            0x0000000102a96073 st_foreach + 451
15  nokogiri.bundle                 0x0000000103182363 dealloc + 51 (xml_document.c:30)
16  ruby                            0x00000001029fe260 finalize_list + 192

OSX 10.7.4, Ruby 1.9.3-p194

@rmoriz The stack makes it look like Nokogiri is trying to deallocate nodes that are already deallocated. My guess is that there are some cyclic references somewhere, Nokogiri/libxml has a bug, or the GC is acting up.

Can you get this to crash in a test? Use the same template and stylesheet in a new Rails app, for example.

@Mange It's an already existing app with rather complex dependencies. I added the roadie gem and reran my test suite, which produces the exception mentioned above. Interestingly enough, it seems to work on a fresh 10.8.0 system and on 10.7 with custom libxml2+libxslt from homebrew (and a reinstalled nokogiri of course).

Wild speculation: This might be an issue with the OS versions of libxml2/libxslt that come with 10.7.4.

Yeah, seems like OS X always has broken versions of those libraries. I expect Roadie to crash on those.

Let me know if you find anything else. :-)

I want more people to test the fork in pull #32. Does it work for you?

You need more people to test Object#dup ? Are you trolling?

Most people deploy to Linux machines while that change tries to fix problems on Mac OS X when the user forgets to ditch the buggy builtin versions of libxml and libxslt.
Of course I want to see if this fix works for the Linux users too.

No one has been able to supply a test for me to run so I cannot test it myself.

Still no one? Some of you were screaming for a fix, and when Josh and Ingemar offer one that might fix this issue, no one wants to try it?

Just fyi, I recently upgraded OSX to mountain lion and the latest version of xcode and I noticed just now that the gem doesn't segfault anymore. I don't have a linux machine though, so unfortunately I can't help you.

Thanks anyway. I'm glad Apple fixed their libraries. :-)

Still bugging on Lion. :(

Did you recompile libxslt and libxml2 after upgrading to Lion?

It worked after I ran brew install libxml2 libxslt and then installed the nokogiri gem.

That's because we all gave up on using the Gem… I'm not interested in a fix anymore, I've moved on…

Lee


I've solved this issue on Snow Leopard by installing libxml2 and libxslt via homebrew, and building nokogiri against those libraries, instead of those provided by the OS.

This post details the necessary steps.

@asymmetric, your solution worked for me.

This does seem to confirm that this is a nokogiri/Lion-related issue.

The fix from pull #32 which was introduced in 2.3.3 worked for me on Linux. Thanks!