Full Version: Render Failed error
Far Star Productions
I just started doing some rendering with v13 and I am getting an odd "render failed" error at times on render nodes.
Is anyone else getting these render failed errors?
rickh
QUOTE(Far Star Productions @ Dec 8 2006, 01:58 PM) *

I just started doing some rendering with v13 and I am getting an odd "render failed" error at times on render nodes.
Is anyone else getting these render failed errors?


I don't get this happening regularly, but I typically have just three render slaves. I think you have many more.

What is the exact error message from the error window?

Do the errors happen at the same point in a render or at random points?

Are the Slaves banned after an error (which is common if it is a networking error)?

What value do you have for the "Launch Throttle Delay" setting in RenderServer?

What operating system does the PC have which stores the Netrender Project?

Do you have a Domain Server network or a Workgroup network?

How many RenderSlaves do you have?

What release of V13 have you got? (Some of the recent versions had Netrender bugs.)

Have you installed the V14 alpha or alpha1 NEt2006.exe installs on any of your Netrender PCs?

The reason I ask the last question is that these two releases had a bug where they put their Netrender settings in the same registry locations as V13. Some of your nodes might be running V14 alpha rather than V13.

This bug has apparently been fixed in the latest V14 Alpha2, but I have not tried this yet.

Richard Harrowell.
Far Star Productions
Before going any further, let me add some information. I have been rendering in v12 and all has been great besides the already known bugs. I have done simple animation tests with v13, but today was the first time we put it to the test. I am rendering out scenes in v13 N with the same models and the same settings as we have been using in v12.
I am rendering with 10 nodes on this project. Since my last post I am now seeing frames on the Intel machines taking much longer than on the AMD Athlons: 12 minutes on the Athlons, 2 hours on the Intels. I have never seen this big a difference in render times.

To answer your questions

What is the exact error message from the error window?

In the project area where it says errors, I see a number in red. I right-click on the project to open the dialog box that says "Show errors". I click on that, and all it says in Description is "Render Failed" and which machine.

Do the errors happen at the same point in a render or at random points?

Random points, and sometimes the machine will start rendering fine again if I just let it keep failing.

Are the Slaves banned after an error (which is common if it is a networking error)?

Slaves are not banned

What value do you have for the "Launch Throttle Delay" setting in RenderServer?

10

What operating system does the PC have which stores the Netrender Project?
Windows Server 2003 Enterprise Edition

Everything is on the server. There are two drives on the server: one drive has the models and projects, and the other is where the rendered frames are saved.

Do you have a Domain Server network or a Workgroup network?

workgroup

How many RenderSlaves do you have?

15, but I cannot run them all. I run into issues once 11 are rendering: I get banned slaves due to a "not able to save to path" error. I looked into it, and it looks like I need to purchase something for the server so it will allow more. I am rendering with 10 at this time.

What release of V13 have you got? (Some of the recent versions had Netrender bugs.)

v13 N

Have you installed the V14 alpha or alpha1 NEt2006.exe installs on any of your Netrender PC's?

no


Thank you for your time on this; it is greatly appreciated.
rickh
Try it with "Launch Throttle Delay" set to 1 and see what happens (or even zero).

Launch Throttle Delay is a new feature that was added around V13h, and it has some bugs.

Obviously your V12 did not have this feature at all. A setting of zero essentially disables the Launch Throttle Delay.

Richard Harrowell.
rickh
QUOTE(Far Star Productions @ Dec 8 2006, 08:21 PM) *

What operating system does the PC have which stores the Netrender Project?
Windows Server 2003 Enterprise Edition

Everything is on the server. There are two drives on the server: one drive has the models and projects, and the other is where the rendered frames are saved.

Do you have a Domain Server network or a Workgroup network?

workgroup

How many RenderSlaves do you have?

15, but I cannot run them all. I run into issues once 11 are rendering: I get banned slaves due to a "not able to save to path" error. I looked into it, and it looks like I need to purchase something for the server so it will allow more. I am rendering with 10 at this time.


I may be able to overcome your Slave limit problem.

First some background.

There are two different but similar problems on a Workgroup network.

The first I am sure you are aware of: a Windows PC in a workgroup will only allow connections to its network resources from ten other PCs, and it will refuse any further connection attempts.

So if you have 15 PCs in a pool and you enable the project, then 15 slaves try to connect to the server. 4 cannot get a connection and RenderServer quickly bans them from the network.

If you changed to a Windows Domain network using your server, this particular problem would be solved, but then you would hit the second problem: all Windows PCs (including servers) only allow a maximum of 10 TCP/IP connections at a time. Yes, the PCs will accept more than 10 connections, but the excess are queued. At any instant there are only 10 active connections.

Again in the pool start-up scenario, some slaves might find themselves queued long enough for a timeout to occur, resulting in a banned PC again. You can get an unofficial patch to overcome this limit, but don't. It is a bad solution, and Microsoft removes the patch with Windows updates anyway.

The solution is to use a Linux/FreeBSD machine acting as a Workgroup share to hold both the A:M project files and also act as the render destination. Both limit problems are solved completely.

Now the good thing is that you can do this in a very robust way without needing to spend a cent.

Since you have a server, go to the VMWare site and download the free VMWare Server for Windows (it will only run on a Windows server, not on XP Pro, Win2000, etc). From the same site, also download a free appliance called "FreeNAS", which is a full NAS Network/Workgroup drive system. You could run it on a separate PC, but in this case it will run as a virtual PC on your server.

I have probably said enough for now, but I can give you more details if you like. It is all surprisingly easy.

Just to finish, this solution will give you a new networked PC (a virtual one) on your network that will be totally happy working with your other Workgrouped PCs, with the difference that it will accept any number of connections without blinking. FreeNAS uses FreeBSD, a free Unix descendant that is much loved by the hardcore networking experts, and it will happily allow a massive number of concurrent TCP/IP connections without queueing.

Just copy the rendering project files to the FreeNAS PC on the network, set the rendering output address to the same FreeNAS PC, and you should now be able to run any number of render slaves. (Up to the Netrender limit. Is it 50 for the standard dongle?)
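Once the files are moved, it may be worth confirming from each slave that it can really save to the new share, since "not able to save to path" was the error that got slaves banned. Here is a minimal Python sketch of such a check. The function name and the UNC path `\\freenas\renders` are made up for illustration; substitute whatever share name you actually set up. This is not part of Netrender.

```python
import tempfile


def can_write_to(folder):
    """Try to create (and automatically delete) a small temporary file in folder."""
    try:
        # NamedTemporaryFile removes the file again when the block exits.
        with tempfile.NamedTemporaryFile(dir=folder, prefix="netrender_check_"):
            pass
    except OSError:
        # Covers a missing path, a refused connection, or denied permissions.
        return False
    return True


if __name__ == "__main__":
    # Hypothetical share path; replace with your own render output folder.
    output_folder = r"\\freenas\renders"
    print(output_folder, "writable:", can_write_to(output_folder))
```

Run it on a slave that was being banned and on one that was not, and compare the results.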

Richard Harrowell.
Far Star Productions
Thanks for all of the info! I will have my co-producer, who handles this stuff, look into it. It sounds like the way to go. Thanks.
rickh
QUOTE(Far Star Productions @ Dec 10 2006, 09:29 AM) *

Thanks for all of the info! I will have my co-producer, who handles this stuff, look into it. It sounds like the way to go. Thanks.


If you have an old, unneeded PII or PIII computer with a 20 to 40 GB drive, then the other option is to install FreeNAS (or, as an alternative, OpenFiler) on it and locate the render files there.


http://www.freenas.org
http://www.openfiler.com

I think many find FreeNAS the better package: leaner and meaner. The main rule with both is "Never use a software RAID drive".

Did the change I recommended to the "Launch Throttle Delay" setting fix your current problem?

Richard Harrowell.
Far Star Productions
No, changing the throttle delay did nothing.
v13 sucks at this point.
Far Star Productions
Now all machines are getting the error, meaning no progress on the render farm. Not good.
It's got to be my setup and not v13? I think not. It is v13.
How long do you think it will take the TWO project to start getting their face kicked in by this?

I watched one of the render nodes load the project and go into rendering, and waited to see it fail. Just before it did, it flashed an "error loading string 5923" message.
Far Star Productions
I filed a report.
Too bad v13 has failed its test once again, and so late in the game, but I am sure V14 will be great. Basically, v13 has not been a version the Far Star project has been able to use.
rickh
QUOTE(Far Star Productions @ Dec 10 2006, 01:28 PM) *

Now all machines are getting the error, meaning no progress on the render farm. Not good.
It's got to be my setup and not v13? I think not. It is v13.
How long do you think it will take the TWO project to start getting their face kicked in by this?

I watched one of the render nodes load the project and go into rendering, and waited to see it fail. Just before it did, it flashed an "error loading string 5923" message.


I just tried V13 netrender on my system and it is working fine. I just installed the latest version (13o) to check that there is not some new problem.

Have you tried rendering Toys? If that doesn't render, then it would seem your installation has a problem.

Have you tried converting the V12 project to V13 before running Netrender?

Perhaps we are doing things differently. For a start, I have A:M installed on every PC rather than a shared A:M program folder on a network drive. I render to either TGA or OpenEXR files, never to MOVs or AVIs. I typically have up to three render slaves.

I hope you get this sorted.

Richard Harrowell.
Far Star Productions
Yes, v13 has been rendering for me as well, but the simple fact is the render nodes are freaking out all over the place with this "render failed" error, with no explanation and for no apparent reason. I am not sure why you feel inclined to think it is our net render setup that is causing the issue, when my render farm setup works in v12, rather than that v13 is not performing its task as it should.
I asked this question about failed renders to see if anyone else is getting them before filing a report, to try to get it moved up the ladder in priority, but it once again looks like I am the lone ranger.


To answer your questions:
All projects are converted to v13.
I have A:M installed on all PCs.
All folders are on the server.
I render to Targa and EXR only.

Thanks once again for your time on this.



Far Star Productions
Here is an interesting find to add to the v13 pile. I tried to render the project that net render is having a problem with in the A:M software itself. It will render for about 15 frames and then I get an exception 10 and A:M crashes.
Exception 10 was an ugly one in v12, but Hash got it taken care of. It looks like it is back in v13.
ddustin
Wasn't exception 10 a memory thing?

Meaning the scene was too large and the machine ran out of memory.

Sorry to hear you are running into so many problems...

David
rickh
I do not know about the Exception 10 problem, but if David is correct and it is a memory problem, then there could be a memory leak that would show up in the Performance tab of Task Manager. If you look at the graph in Task Manager, does the memory drop down to the same starting point for each frame, or is it higher each time a new frame starts?

If there is a memory leak, it could be an A:M problem, or if you are using 3rd party plugins like materials, shaders, post effects, the problem could be in one of these plugins.
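If you jot down the memory in use at the start of each frame, the "higher each time" test can be written out like this. It is only a rough Python sketch: the function name, the 5 MB tolerance, and the readings are hypothetical numbers for illustration, not measurements from A:M.

```python
def baseline_is_climbing(baselines_mb, tolerance_mb=5.0):
    """Return True if every frame starts with clearly more memory in use than the last."""
    if len(baselines_mb) < 3:
        return False  # too few frames to call it a trend
    # Difference between each frame's starting memory and the previous frame's.
    steps = [later - earlier for earlier, later in zip(baselines_mb, baselines_mb[1:])]
    return all(step > tolerance_mb for step in steps)


# Hypothetical readings in MB, noted at the start of four consecutive frames.
leaky = [400, 460, 520, 585]
healthy = [400, 402, 399, 401]
print("leaky run climbing:", baseline_is_climbing(leaky))
print("healthy run climbing:", baseline_is_climbing(healthy))
```

A steadily climbing baseline points towards a leak; a baseline that returns to roughly the same value each frame points away from one.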

I have been running v13o netrender today to try to break it, and it is working perfectly for me. I have tried a few projects of different complexities, rendering to both TGAs and OpenEXRs.

Do you get the same problem happening if you try and render any of the projects from the A:M CD? If you do, then perhaps we can try exactly the same render and compare results.

Richard Harrowell.
Far Star Productions
David, this is what I am starting to think as well. The projects are too large for v13 but were OK for v12, just like it was going from v11 to v12.
We use a lot of Darktree materials. Do you guys use Darktree materials?
Far Star Productions
Here is another odd finding. The project I am having problems getting rendered only fails with one camera angle; the other camera angles are rendering OK. I still get the failed render error, but then the machine will eventually jump on and play nice with the others.
rickh
QUOTE(Far Star Productions @ Dec 12 2006, 05:48 AM) *

David, this is what I am starting to think as well. The projects are too large for v13 but were OK for v12, just like it was going from v11 to v12.
We use a lot of Darktree materials. Do you guys use Darktree materials?


There are some really great Darktree textures, but we tend to use them sparingly because they are fairly slow. I might test out a Darktree render tonight and see how it goes.

I am wondering: V13 now comes with Darktree. I imagine you have the full Darktree package rather than just the free Simbiont package that I have been using. Could the Darktree installation be in conflict somehow with the V13 Darktree?

For example, is the Simbiont2.atx in the Textures folder dated the same as the other Hash textures?

All V13 needs to run a Darksim texture is the texture definition files (.dsts, .dstc, etc). No Darktree program or texture files should be copied or installed into the V13 folder.
Far Star Productions
I am an old pro with Darktree and all is in order on the system.
Don't you find it odd that the other cameras are able to render fine and one of them can't?
Far Star Productions
Found the issue that is causing the failed render: it's materials with an environment map. Ooh, ugly.
Setting up a project and sending it in to Reports.

The environment map was being used as reflective. This is the only one I tested that was repeatable, but I am sure the other material environment maps we are using, such as specularity, will cause the same issue.

The reason the other cameras were able to render the project is that the model with the environment map material was not in shot.
Far Star Productions
Here is yet another find on the environment map issue that causes the failed render error.
If you have an environment map with the global axis turned on, it will not render; if you have the global axis turned off, the environment map will render.
All of our models with environment maps have the global axis turned on and these models render fine in the AM software renderer.
Far Star Productions
Ahh the render farm is cranking away as it should be.
Just turned off global axis on all environment map materials until Hash gets the bug fixed.



rickh
That was excellent work to track that problem down. It takes a real effort, so thanks.

If some of the slaves are slow to start new renders, you might still have to reduce the "Launch Throttle Delay" setting, but otherwise it seems like you have got V13 working now.

Richard Harrowell.
Far Star Productions
Richard,
Yes this was a tricky one that is for sure. I am sure Hash will be able to fix this issue with environment map materials easily since it is working fine in AM.
I am not experiencing any slow to load render nodes so I have left the throttle set to the default of 10. Thanks for the heads up.
Far Star Productions
The issue with environment map materials has been resolved in v13 O.
I was using v13 N.
The fast response from Noel at Hash soon revealed he was not getting the issue in v13 O, so I tested it with v13 O, and the environment map with global axis turned on is working fine.
Lesson learned on this one for me.
If you have an issue with software, be sure you are on the current version. You never know what might have been fixed without you knowing it.
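For anyone who wants to script that check across a farm, here is a small Python sketch that compares release labels written the way they appear in this thread ("v13 N", "v13.N", "13o"). The function names are my own, and the idea that a later letter means a later release is an assumption based on the labels above, not on anything Hash publishes. Alpha labels such as "V14 alpha" are deliberately not handled and raise an error.

```python
import re


def parse_release(label):
    """Split a label such as 'v13 N', 'v13.N' or '13o' into (13, 'n')."""
    match = re.fullmatch(r"\s*v?(\d+)[\s.]*([a-z]?)\s*", label, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognised release label: {label!r}")
    return int(match.group(1)), match.group(2).lower()


def is_older(installed, latest):
    """True when the installed release sorts before the latest one."""
    return parse_release(installed) < parse_release(latest)


# Labels taken from this thread: the node ran v13 N while v13 O was current.
print("v13 N older than v13 O:", is_older("v13 N", "v13 O"))
```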
KenH
It would be nice to skip right to the constructive resolution of the problem without the Hash conspiracy theories along the way. Perhaps you'll remember that next time. I'm glad you solved it. I appreciate you're under pressure to render your project.

On that note, how's the Far Star project going? Any website to look at?
martin
I don't know if this is any encouragement to you, Jack, but Hash is using our Render Farm extensively on TWO - starting last week. I don't know if any of the issues we encounter will be the same issues that you're having but hopefully they are, and we'll be able to duplicate and fix your problems.

Also, for your information, even though it may not seem that way to you, Will Pickering's only programming task for the past 6 months has been to chase around NetRender issues from A:M Reports. Obviously, the complexity of NetRender, or the difficulty in duplicating the issues, has made Hash seem unconcerned to you. Hopefully, Hash being so dependent on NetRender for TWO will bring the required manpower to bear to make you feel like we're doing something.

Far Star Productions
Well, well, well, look who has decided to come over to our side of the playground.

Ken,
You're right, I will do my best not to do that anymore. I feel like I just got caught smoking.
The Far Star project is doing very well, thank you for asking.
There is a web site, but it is for artists working on the project only. The public will see no images of the movie until the movie is complete and a trailer is released.

Martin,
Great to see you and the TWO project in the net render trenches with us. It truly gives peace of mind, that is for sure.
Sorry if I came off poorly and unappreciative of Will's outstanding work on net render and the great gains net render has undergone in v13. I have a tendency to do that in my fits of passion. I am sure you can understand, but that is still no excuse.
I am very pleased to announce v13 net render is performing great for us!
I will be sure to keep the data coming your way as issues come up, but will be sure to be using the current version next time.
You guys at Hash rock and I love you all for it.




