2014/05/05

"Unknown Job Id Error (15001)" in Torque (PBS)

Today I spent some time struggling with a somewhat obscure error at work.

A job with dependencies on other jobs was impossible to queue in Torque (running in a virtual cluster created just for development and testing).

The damn job:

----------------------------------------
#!/bin/bash

#PBS -S /bin/bash
#PBS -N metgrid
#PBS -q batch
#PBS -l walltime=02:00:00
#PBS -l nodes=1:ppn=1
#PBS -o metgrid-oe.log
#PBS -j oe
#PBS -W depend=afterok:325.dadm:326.dadm

#bla bla bla

----------------------------------------
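The ids in the "depend" line above were typed by hand. A minimal sketch of building that list from whatever ids qsub prints instead, assuming bash; build_afterok and the three .pbs file names are hypothetical and not part of my real scripts:

```shell
#!/bin/bash
# build_afterok ID...: print a "depend=afterok:..." attribute for the given job ids.
# Hypothetical helper for illustration; it only joins the ids with ':'.
build_afterok() {
    local IFS=:
    echo "depend=afterok:$*"
}

# Hypothetical usage (ungrib.pbs, geogrid.pbs and metgrid.pbs are assumed names):
#   ungrib_id=$(qsub ungrib.pbs)      # qsub prints the id of the new job
#   geogrid_id=$(qsub geogrid.pbs)
#   qsub -W "$(build_afterok "$ungrib_id" "$geogrid_id")" metgrid.pbs
```

This way the dependency list always uses the ids exactly as the server reports them.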

The rejection was reported in the server log:

----------------------------------------
cat /var/spool/torque/server_logs/20140505
...
05/05/2014 17:55:00;0080;PBS_Server.4427;Req;req_reject;Reject reply code=15041(Job rejected by all possible destinations (check syntax, queue resources, ...)), aux=0, type=Commit, from xxx@dhead
05/05/2014 17:55:29;0010;PBS_Server.4427;Job;327.dadm;Exit_status=0 resources_used.cput=00:00:28 resources_used.mem=43356kb resources_used.vmem=96480kb resources_used.walltime=00:00:50
...

----------------------------------------


The dependency jobs running:

----------------------------------------
qstat -n:

dadm:
                                                                                  Req'd    Req'd       Elap
Job ID                  Username    Queue    Jobname          SessID  NDS   TSK   Memory   Time    S   Time
----------------------- ----------- -------- ---------------- ------ ----- ------ ------ --------- - ---------
334.dadm                user        infiniba ungrib            19895     1      1    --   02:00:00 R  00:00:00
   dhead/0
335.dadm                user        infiniba geogrid           19901     1      1    --   02:00:00 R  00:00:00
   dhead/1

----------------------------------------

Then this surprised me:

dhead:~/swrf/test/testMetgrid$ qstat -f 340.dadm
qstat: Unknown Job Id Error 340.dadm.mydomain.cl


So I was using an incomplete job id, which the machine tried to repair by adding the domain, but even with the full id it failed again...

user@dhead:~/swrf/test/testMetgrid$ qstat -f 340.dadm.mydomain.cl
qstat: Unknown Job Id Error 340.dadm.mydomain.cl

Well, the domain was missing from the server configuration...

This hint explained the solution:

"check that the first name given in your hosts file for your server exactly matches the name given
in your server_name file in your torque configuration."

... and yes, it was about basic configuration :-P
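A minimal sketch of that check, assuming bash and awk. The function names are hypothetical, and the file paths and the server IP are passed in as arguments rather than assumed:

```shell
#!/bin/bash
# first_host_name HOSTS_FILE IP: print the first name listed for IP in a hosts file.
first_host_name() {
    awk -v ip="$2" '$1 == ip { print $2; exit }' "$1"
}

# check_server_name HOSTS_FILE SERVER_NAME_FILE IP:
# compare that first name with the content of Torque's server_name file.
check_server_name() {
    local hosts_file=$1 server_name_file=$2 ip=$3
    local from_hosts from_torque
    from_hosts=$(first_host_name "$hosts_file" "$ip")
    # server_name holds a single name; strip the trailing newline and spaces
    from_torque=$(tr -d '[:space:]' < "$server_name_file")
    if [ "$from_hosts" = "$from_torque" ]; then
        echo "OK: $from_torque"
    else
        echo "MISMATCH: hosts='$from_hosts' server_name='$from_torque'"
        return 1
    fi
}
```

On my server it would be called as: check_server_name /etc/hosts /var/spool/torque/server_name 192.168.24.38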

The changes I made:

root@dadm:/var/spool/torque# cat /etc/hosts
127.0.0.1    localhost
192.168.24.38    dadm.mydomain.cl dadm


root@dadm:/var/spool/torque# cat /var/spool/torque/server_name
dadm.mydomain.cl

Then I restarted the server, and after queueing my job again everything went fine.


----------------------------------------
root@dadm:/var/spool/torque# qstat -n

dadm.mydomain.cl:
                                                                                  Req'd    Req'd       Elap
Job ID                  Username    Queue    Jobname          SessID  NDS   TSK   Memory   Time    S   Time
----------------------- ----------- -------- ---------------- ------ ----- ------ ------ --------- - ---------
352.dadm.mydomain.cl     user     infiniba ungrib            20605     1      1    --   02:00:00 R  00:00:04
   dhead/0
353.dadm.mydomain.cl     user     infiniba geogrid           20619     1      1    --   02:00:00 R  00:00:04
   dhead/1
354.dadm.mydomain.cl     user     infiniba metgrid             --      1      1    --   02:00:00 H       --

----------------------------------------

Now I just need to fix the shell scripts that build and configure the cluster.

2014/04/02

maven-release-plugin + Vaadin

While releasing a Mavenized web project that uses Vaadin, several problems popped up because the Vaadin compilation modifies the sources, and the release plugin stops when it finds local modifications.


In the end, the solution was to exclude the problematic part from the modification check; it is always regenerated if it does not exist anyway:

------------------------------------------------------------------------
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-release-plugin</artifactId>
    <version>2.5</version>
    <configuration>
        <providerImplementations>
            <svn>javasvn</svn>
        </providerImplementations>
        <autoVersionSubmodules>true</autoVersionSubmodules>
        <tagNameFormat>v@{project.version}</tagNameFormat>
        <checkModificationExcludes>
            <checkModificationExclude>${project.build.outputDirectory}/**/VAADIN/**</checkModificationExclude>
        </checkModificationExcludes>

        <preparationGoals>clean verify</preparationGoals>
    </configuration>
    <dependencies>
        <dependency>
            <groupId>com.google.code.maven-scm-provider-svnjava</groupId>
            <artifactId>maven-scm-provider-svnjava</artifactId>
            <version>2.1.1</version>
        </dependency>
    </dependencies>
</plugin>

------------------------------------------------------------------------

Enjoy!

2014/02/27

Unable to tag SCM with maven-release-plugin

An incomplete layout in your subversion repository may be one of the causes for this error:

mvn release:clean -e
mvn release:prepare --batch-mode -e
mvn release:perform -e



------------------------------------------------
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.apache.maven.plugins:maven-release-plugin:2.4.2:prepare (default-cli) on project geoemissions: Unable to tag SCM
Provider message:
The svn tag command failed.
Command output:
svn: '/svn/sources/GeoEmissions/!svn/bc/91/tags' path not found

    at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:213)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:84)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:59)
    at org.apache.maven.lifecycle.internal.LifecycleStarter.singleThreadedBuild(LifecycleStarter.java:183)
    at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:161)
    at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:320)
    at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:156)
    at org.apache.maven.cli.MavenCli.execute(MavenCli.java:537)
    at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:196)
    at org.apache.maven.cli.MavenCli.main(MavenCli.java:141)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:290)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:230)
    at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:409)
    at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:352)

------------------------------------------------

Similar error with scm:

mvn scm:tag -Dtag="Test"

------------------------------------------------
[ERROR] Provider message:
[ERROR] SVN tag failed.
[ERROR] Command output:
[ERROR] svn: E160013: Path 'http://svn.geoaire.cl/svn/sources/YourProject/tags/v0.01' does not exist in revision 91

------------------------------------------------

My project was up and running in the trunk, what else was I going to need? Maybe tags and branches? :-)

From now on I'd rather prepare a shell script (which I version too) that sets up my projects from the very beginning with these three important directories:


------------------------------------------------
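WORKSPACE=/srv/svn/repositories/sources/YourProject
mkdir -p "$WORKSPACE"
svnadmin create --fs-type fsfs "$WORKSPACE"
chmod -R 770 "$WORKSPACE"
# -m gives the log message that "svn mkdir" needs when it commits directly to a URL
svn mkdir -m "Create trunk, tags and branches" file:///srv/svn/repositories/sources/YourProject/{trunk,tags,branches}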
------------------------------------------------
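For a repository that already exists, a quick check before releasing could look like the sketch below. It assumes bash and relies on "svn ls" printing directories with a trailing slash; missing_layout_dirs is a hypothetical helper name:

```shell
#!/bin/bash
# missing_layout_dirs: read an "svn ls" listing on stdin and print the
# standard top-level directories (trunk, tags, branches) absent from it.
missing_layout_dirs() {
    local listing dir
    listing=$(cat)
    for dir in trunk tags branches; do
        # "svn ls" shows directories with a trailing slash, e.g. "trunk/"
        if ! printf '%s\n' "$listing" | grep -qxF "$dir/"; then
            echo "$dir"
        fi
    done
}

# Hypothetical usage against the repository created above:
#   svn ls file:///srv/svn/repositories/sources/YourProject | missing_layout_dirs
```

An empty output means the three directories are in place.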

After doing the "svn mkdir", both plugins worked well. This was my configuration:

------------------------------------------------
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-scm-plugin</artifactId>
                    <version>1.9</version>
                    <configuration>
                        <providerImplementations>
                            <svn>javasvn</svn>
                        </providerImplementations>
                        <connectionType>developerConnection</connectionType>
                        <username>xxx</username>
                        <password>yyy</password>
                    </configuration>
                    <dependencies>
                        <dependency>
                            <groupId>com.google.code.maven-scm-provider-svnjava</groupId>
                            <artifactId>maven-scm-provider-svnjava</artifactId>
                            <version>2.1.0</version>
                        </dependency>
                    </dependencies>
                </plugin>


                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-release-plugin</artifactId>
                    <version>2.4.2</version>
                    <configuration>
                        <providerImplementations>
                            <svn>javasvn</svn>
                        </providerImplementations>                  
                        <autoVersionSubmodules>true</autoVersionSubmodules>
                        <tagNameFormat>v@{project.version}</tagNameFormat>
                    </configuration>
                    <dependencies>
                        <dependency>
                            <groupId>com.google.code.maven-scm-provider-svnjava</groupId>
                            <artifactId>maven-scm-provider-svnjava</artifactId>
                            <version>2.1.0</version>
                        </dependency>
                    </dependencies>
                </plugin>

------------------------------------------------

Enjoy!