【Question title】: How to use org.slf4j.Logger in Spark?
【Posted】: 2019-04-30 10:35:03
【Question】:

I am trying to use org.slf4j.Logger in Spark. If I write the code as below, I get the compile error "non-static field cannot be referenced from a static context", because the method main is static while logger is a non-static field.

import org.apache.spark.api.java.*;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.broadcast.Broadcast;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class simpleApp {
    private final Logger logger = LoggerFactory.getLogger(getClass());

    public static void main(String[] args) {
        String logFile = "/user/beibei/zhaokai/spark_java/a.txt"; // Should be some file on your system
        SparkConf conf = new SparkConf().setAppName("Simple Application");
        JavaSparkContext sc = new JavaSparkContext(conf);
        JavaRDD<String> logData = sc.textFile(logFile).cache();

        logger.info("loading graph from cache");

        long numAs = logData.filter(new Function<String, Boolean>() {
            public Boolean call(String s) { return s.contains("a"); }
        }).count();

        long numBs = logData.filter(new Function<String, Boolean>() {
            public Boolean call(String s) { return s.contains("t"); }
        }).count();

        System.out.println("Lines with a: " + numAs + ", lines with t: " + numBs);
    }
}

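However, if I write it as follows, I get a different error:

Exception in thread "main" org.apache.spark.SparkException: Task not serializable

This happens because an instance of the simpleApp class is not serializable.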

import org.apache.spark.api.java.*;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.broadcast.Broadcast;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class simpleApp {
    private final Logger logger = LoggerFactory.getLogger(getClass());

    public static void main(String[] args) {
        new simpleApp().start();
    }

    private void start() {
        String logFile = "/path/a.txt"; // Should be some file on your system
        SparkConf conf = new SparkConf().setAppName("Simple Application");
        JavaSparkContext sc = new JavaSparkContext(conf);
        JavaRDD<String> logData = sc.textFile(logFile).cache();

        logger.info("loading graph from cache");

        long numAs = logData.filter(new Function<String, Boolean>() {
            public Boolean call(String s) { return s.contains("a"); }
        }).count();

        long numBs = logData.filter(new Function<String, Boolean>() {
            public Boolean call(String s) { return s.contains("t"); }
        }).count();

        System.out.println("Lines with a: " + numAs + ", lines with t: " + numBs);
    }
}

So what should I do?
If I want to use other packages in the same way as org.slf4j.Logger, will I run into the same problem?

【Comments】:

  • How about making logger a static member of the simpleApp class? For example, LoggerFactory.getLogger(simpleApp.class)?
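
A minimal sketch of why the static-member suggestion tends to help. It uses only the JDK so that it runs without Spark on the classpath: java.util.logging.Logger stands in for org.slf4j.Logger, and SerFunction, InstanceStyle and canSerialize are hypothetical names that imitate Spark's serializable Function and its serializability check. An anonymous class that uses a field of its enclosing instance keeps a reference to that instance, so the whole enclosing object has to be serializable; one created in a static context has no such reference.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.logging.Logger;

public class StaticLoggerSketch {
    // Hypothetical stand-in for Spark's serializable Function interface.
    interface SerFunction<T, R> extends Serializable {
        R call(T t);
    }

    // Static logger: usable from static main, and never part of a serialized object.
    static final Logger LOGGER = Logger.getLogger(StaticLoggerSketch.class.getName());

    // Created in a static context, so there is no enclosing instance to capture.
    static SerFunction<String, Boolean> staticStyleFilter() {
        return new SerFunction<String, Boolean>() {
            public Boolean call(String s) {
                LOGGER.fine("checking " + s);
                return s.contains("a");
            }
        };
    }

    // Mirrors the second version in the question: instance logger, instance method.
    static class InstanceStyle {
        final Logger logger = Logger.getLogger(getClass().getName());

        SerFunction<String, Boolean> filter() {
            return new SerFunction<String, Boolean>() {
                public Boolean call(String s) {
                    logger.fine("checking " + s); // reads a field of the enclosing instance
                    return s.contains("a");
                }
            };
        }
    }

    // Returns true if plain Java serialization accepts the object.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (IOException e) {
            // NotSerializableException is a subclass of IOException.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("static style:   " + canSerialize(staticStyleFilter()));       // true
        System.out.println("instance style: " + canSerialize(new InstanceStyle().filter())); // false
    }
}
```

With SLF4J the corresponding declaration would be the one from the comment above, private static final Logger logger = LoggerFactory.getLogger(simpleApp.class);, which also resolves the "non-static field" compile error in the first version because main can reach a static field directly.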

Tags: java apache-spark slf4j


【Answer 1】:

There are probably several options. I would suggest org.apache.spark.internal.Logging, which Spark itself provides (Spark >= 2.2).

Its documentation says:

/**
 * Utility trait for classes that want to log data. Creates a SLF4J logger for the class and allows
 * logging messages at different levels using methods that only evaluate parameters lazily if the
 * log level is enabled.
 */

private def isLog4j12(): Boolean = {
  // This distinguishes the log4j 1.2 binding, currently
  // org.slf4j.impl.Log4jLoggerFactory, from the log4j 2.0 binding, currently
  // org.apache.logging.slf4j.Log4jLoggerFactory
  val binderClass = StaticLoggerBinder.getSingleton.getLoggerFactoryClassStr
  "org.slf4j.impl.Log4jLoggerFactory".equals(binderClass)
}

If you want to do the same thing yourself without using the API that Spark provides, you can imitate it.
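
One way to imitate it in Java, as a rough sketch: as far as I understand, Spark's Logging trait keeps its SLF4J logger in a transient field and creates it lazily on first use, so the logger is never serialized with the object. The example below applies that idea using only the JDK; java.util.logging.Logger is a stand-in for org.slf4j.Logger, and the names LazyLoggerSketch, log(), containsA and roundTrip are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.logging.Logger;

public class LazyLoggerSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    // transient: skipped by Java serialization, so the logger never travels with the object.
    private transient Logger log;

    // Lazily (re)creates the logger; after deserialization the field is null again.
    Logger log() {
        if (log == null) {
            log = Logger.getLogger(getClass().getName());
        }
        return log;
    }

    boolean containsA(String s) {
        log().fine("checking " + s);
        return s.contains("a");
    }

    // Serializes and deserializes the object, similar to what happens when it is shipped to another JVM.
    static LazyLoggerSketch roundTrip(LazyLoggerSketch original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (LazyLoggerSketch) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        LazyLoggerSketch original = new LazyLoggerSketch();
        original.log().info("logger created before serialization");

        LazyLoggerSketch copy = roundTrip(original);
        copy.log().info("logger re-created after deserialization");
        System.out.println(copy.containsA("spark"));
    }
}
```

With SLF4J the body of log() would call LoggerFactory.getLogger(getClass()) instead. The trade-off against a static logger is a null check on every call in exchange for a logger named after the runtime class, which can matter when subclasses share the same base class.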

Note: with the approach above, use sc.setLogLevel(newLevel) to adjust the log level. For SparkR, use setLogLevel(newLevel).

See also: apache-spark-logging-within-scala

