http://www.cnblogs.com/spork/archive/2010/04/21/1717592.html

 

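From the analysis in the previous post, we know that whether a Hadoop job is submitted to the cluster or run locally depends closely on the parameters in the configuration files under the conf folder. Many other classes depend on conf as well, so when submitting a job, be sure to put conf on your classpath.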

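  Because Configuration uses the current thread's context class loader to load resources and files, we take a dynamic-loading approach here: first add the required dependency libraries and resources, then build a URLClassLoader and use it as the current thread's context class loader.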

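    // Requires: java.net.URL, java.net.URLClassLoader
    // classPath is a collection of URLs filled in by addClasspath();
    // its declaration is not shown in this post.
    public static ClassLoader getClassLoader() {
        // Prefer the current context class loader as the parent, then fall
        // back to EJob's own loader, then to the system class loader.
        ClassLoader parent = Thread.currentThread().getContextClassLoader();
        if (parent == null) {
            parent = EJob.class.getClassLoader();
        }
        if (parent == null) {
            parent = ClassLoader.getSystemClassLoader();
        }
        return new URLClassLoader(classPath.toArray(new URL[0]), parent);
    }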

  The code is simple enough that it needs little explanation. An example call:

    EJob.addClasspath("/usr/lib/hadoop-0.20/conf");
    ClassLoader classLoader = EJob.getClassLoader();
    Thread.currentThread().setContextClassLoader(classLoader);
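
  The addClasspath method and the classPath field are not shown in this post, so here is a minimal, self-contained sketch of how the pieces could fit together. Everything in it is an assumption made for illustration: the class name ClasspathSketch, the body of addClasspath, and the demo-site.xml file are hypothetical, not the author's EJob code. It registers a temporary directory, installs a URLClassLoader as the context class loader, and then looks up a resource through it, which is roughly what happens when a configuration file under conf has to be found.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class ClasspathSketch {
    // Hypothetical stand-in for the classPath collection read by getClassLoader().
    private static final List<URL> classPath = new ArrayList<>();

    // Hypothetical addClasspath: records an existing directory or jar as a URL,
    // ignoring empty input, missing paths, and duplicates.
    public static void addClasspath(String component) {
        if (component == null || component.isEmpty()) {
            return;
        }
        try {
            File f = new File(component);
            if (f.exists()) {
                // For an existing directory, toURI() yields a URL ending in "/",
                // which is how URLClassLoader recognises a directory entry.
                URL url = f.getCanonicalFile().toURI().toURL();
                if (!classPath.contains(url)) {
                    classPath.add(url);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Same fallback idea as the method above, reduced to two steps.
    public static ClassLoader getClassLoader() {
        ClassLoader parent = Thread.currentThread().getContextClassLoader();
        if (parent == null) {
            parent = ClasspathSketch.class.getClassLoader();
        }
        return new URLClassLoader(classPath.toArray(new URL[0]), parent);
    }

    // Creates a throwaway directory holding one resource, registers it, and
    // reports whether the resource is visible via the context class loader.
    public static boolean demo() {
        try {
            Path dir = Files.createTempDirectory("conf-sketch");
            Files.write(dir.resolve("demo-site.xml"),
                    "<configuration/>".getBytes(StandardCharsets.UTF_8));
            addClasspath(dir.toString());
            Thread.currentThread().setContextClassLoader(getClassLoader());
            URL found = Thread.currentThread().getContextClassLoader()
                    .getResource("demo-site.xml");
            return found != null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("resource visible: " + demo());
    }
}
```

  Note that the context class loader is set per thread, so in this sketch the lookup only succeeds on the thread that called setContextClassLoader.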

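  With the class loader set, one more step remains: packaging the jar file, that is, having the project package its own classes into a jar. Taking the standard Eclipse project folder layout as an example, what gets packaged is the classes in the bin folder.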

    // Requires: java.io.File, java.io.FileOutputStream, java.io.IOException,
    //           java.util.jar.JarOutputStream, java.util.jar.Manifest
    public static File createTempJar(String root) throws IOException {
        if (!new File(root).exists()) {
            return null; // nothing to package
        }
        Manifest manifest = new Manifest();
        manifest.getMainAttributes().putValue("Manifest-Version", "1.0");
        final File jarFile = File.createTempFile("EJob-", ".jar",
                new File(System.getProperty("java.io.tmpdir")));

        // Delete the temporary jar when the JVM exits.
        Runtime.getRuntime().addShutdownHook(new Thread() {
            public void run() {
                jarFile.delete();
            }
        });

        JarOutputStream out = new JarOutputStream(new FileOutputStream(jarFile), manifest);
        try {
            createTempJarInner(out, new File(root), "");
            out.flush();
        } finally {
            out.close(); // close the stream even if packaging fails
        }
        return jarFile;
    }

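    // Requires: java.io.File, java.io.FileInputStream, java.io.IOException,
    //           java.util.jar.JarEntry, java.util.jar.JarOutputStream
    private static void createTempJarInner(JarOutputStream out, File f, String base)
            throws IOException {
        if (f.isDirectory()) {
            File[] fl = f.listFiles();
            if (fl == null) {
                return; // the directory could not be read
            }
            if (base.length() > 0) {
                base = base + "/";
            }
            // Recurse, extending the entry name with each child's name.
            for (int i = 0; i < fl.length; i++) {
                createTempJarInner(out, fl[i], base + fl[i].getName());
            }
        } else {
            // base is the file's path relative to root; it becomes the entry name.
            out.putNextEntry(new JarEntry(base));
            FileInputStream in = new FileInputStream(f);
            try {
                byte[] buffer = new byte[1024];
                int n = in.read(buffer);
                while (n != -1) {
                    out.write(buffer, 0, n);
                    n = in.read(buffer);
                }
            } finally {
                in.close(); // release the file handle even if a write fails
            }
            out.closeEntry();
        }
    }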
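
  To check that a jar built this way has the expected layout, it can be read back with java.util.jar.JarFile. The sketch below is only an illustration under stated assumptions: the class JarCheckSketch, its method names, and the org/example/Demo.class placeholder are hypothetical, and the packaging step uses java.nio.file.Files.walk rather than the recursive method above, so that the example runs on its own. The entry-naming rule is the same: each file is stored under its path relative to the root folder, with "/" as the separator.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class JarCheckSketch {

    // Packs every regular file under root into a temporary jar, using each
    // file's path relative to root as its entry name.
    public static File packDirectory(Path root) {
        try {
            Manifest manifest = new Manifest();
            manifest.getMainAttributes().putValue("Manifest-Version", "1.0");
            File jar = File.createTempFile("sketch-", ".jar");
            jar.deleteOnExit();

            List<Path> files;
            try (Stream<Path> walk = Files.walk(root)) {
                files = walk.filter(Files::isRegularFile)
                        .sorted()
                        .collect(Collectors.toList());
            }
            try (JarOutputStream out =
                    new JarOutputStream(Files.newOutputStream(jar.toPath()), manifest)) {
                for (Path p : files) {
                    // Jar entry names always use "/", whatever the platform separator is.
                    String name = root.relativize(p).toString()
                            .replace(File.separatorChar, '/');
                    out.putNextEntry(new JarEntry(name));
                    Files.copy(p, out);
                    out.closeEntry();
                }
            }
            return jar;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the names of all entries in the jar, manifest included.
    public static List<String> listEntries(File jar) {
        try (JarFile jf = new JarFile(jar)) {
            List<String> names = new ArrayList<>();
            for (JarEntry entry : Collections.list(jf.entries())) {
                names.add(entry.getName());
            }
            return names;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a tiny fake "bin" folder, packs it, and lists what ended up in the jar.
    public static List<String> demo() {
        try {
            Path root = Files.createTempDirectory("bin-sketch");
            Path pkg = Files.createDirectories(root.resolve("org").resolve("example"));
            // Placeholder bytes, not a real class file; only the layout matters here.
            Files.write(pkg.resolve("Demo.class"), new byte[] {1, 2, 3});
            return listEntries(packDirectory(root));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String name : demo()) {
            System.out.println(name);
        }
    }
}
```

  If an entry shows up with a leading folder name such as bin/, the root passed to the packaging step was probably one level too high, and classes would not resolve under their package names.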
