【Question title】:fgets() blocking when buffer too large
【Posted】:2014-11-04 22:58:40
【Question】:

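I'm currently using select() to tell whether there is data to read on a file descriptor, because I don't want fgets to block. This works 99% of the time. However, if select() detects data in fp, but that data contains no newline and is smaller than my buffer, fgets will block. Is there a way to tell how many bytes are ready to be read?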

    //See if there is data in fp, waiting up to 5 seconds
    select_rv = checkForData(fp, 5);

    //If there is data in fp...
    if (select_rv > 0)
    {
        //Blocks if data doesn't have a newline and the data in fp is smaller than the size of command_out!!
        if (fgets(command_out, sizeof(command_out)-1, fp) != NULL)
        {
            printf("WGET: %s", command_out);
        }
    }
    else if (select_rv == 0)
    {
        printf("select() timed out... No output from command process!\n");
    }

I guess what I really want is to know, before calling fgets, whether a complete line is ready to be read.

【Question comments】:

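  • select works on file descriptors, but fgets works on a FILE *
  • If you want non-blocking I/O, you probably shouldn't be using fgets; see here for details
  • Mixing unbuffered I/O (select() here) and buffered I/O (fgets()) on the same file, to handle the same data, is not a good idea.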
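
To illustrate the last comment, here is a minimal sketch of what can go wrong when select() and fgets() are mixed on the same descriptor. It assumes a typical stdio implementation that reads ahead into the FILE's own buffer; the helper names fd_readable_now and demo_read_ahead are made up for this example. Two lines are written into a pipe, the first is read with fgets(), and select() is then asked about the descriptor. On such an implementation select() reports nothing to read, even though the second line is sitting in the stdio buffer and fgets() returns it immediately.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/select.h>

/*  Returns 1 if select() reports fd readable right now, 0 otherwise.
 *  (Hypothetical helper, zero timeout so it never waits.)           */
static int fd_readable_now(int fd)
{
    fd_set fds;
    FD_ZERO(&fds);
    FD_SET(fd, &fds);
    struct timeval tv = {0, 0};
    return select(fd + 1, &fds, NULL, NULL, &tv) > 0;
}

/*  Writes two lines into a pipe, reads the first with fgets(), then
 *  stores select()'s verdict in *readable_after and the result of a
 *  second fgets() in second_line. Returns 0 on success, -1 on failure. */
static int demo_read_ahead(int *readable_after, char *second_line, int size)
{
    assert(readable_after && second_line && size > 0);

    int p[2];
    if (pipe(p) == -1) return -1;

    FILE *fp = fdopen(p[0], "r");
    if (!fp) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    const char msg[] = "first\nsecond\n";
    char line[64];
    int rc = -1;

    if (write(p[1], msg, strlen(msg)) == (ssize_t)strlen(msg) &&
        fgets(line, sizeof line, fp) != NULL) {

        /*  stdio has most likely pulled both lines into its buffer,
         *  so the descriptor itself now looks empty to select().     */
        *readable_after = fd_readable_now(p[0]);

        if (fgets(second_line, size, fp) != NULL) rc = 0;
    }

    fclose(fp);      /*  also closes p[0]  */
    close(p[1]);
    return rc;
}
```

Calling demo_read_ahead(&readable, buf, sizeof buf) and printing readable and buf shows the mismatch: select() only sees the kernel side of the pipe, not what stdio has already consumed.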

Tags: c select fgets


【Solution 1】:

As MBlanc mentioned, implementing your own buffering with read() is the way to go.

Here is a program demonstrating the general approach. I wouldn't recommend doing exactly this, because:

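  1. The function presented here uses static variables, so it works with only a single file and becomes unusable once that file has ended. In practice, you'd want a separate struct for each file that stores that file's state, and pass it to your function on each call.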

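  2. It maintains the buffer by simply memmove()ing the remaining data down after some data has been removed from the buffer. In practice, implementing a circular queue would probably be a better approach, although the basic usage would be the same.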

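  3. If the output buffer here is larger than the internal buffer, the extra space is never used. In practice, if you ran into that situation, you'd either resize the internal buffer, or copy the internal buffer into the output string and then go back and attempt a second read() call before returning.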


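But implementing all of that would add too much complexity to an example program, and the general approach here shows how to accomplish the task.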

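To simulate delays in receiving input, the main program pipes in the output of the following program, which just writes a few times, sometimes with a newline and sometimes without, with sleep()s between the writes: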

delayed_output.c:

#define _POSIX_C_SOURCE 200809L

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    printf("Here is some input...");
    fflush(stdout);

    sleep(3);

    printf("and here is some more.\n");
    printf("Yet more output is here...");
    fflush(stdout);

    sleep(3);

    printf("and here's the end of it.\n");
    printf("Here's some more, not ending with a newline. ");
    printf("There are some more words here, to exceed our buffer.");
    fflush(stdout);

    return 0;
}

Main program:

buffer.c:

#define _POSIX_C_SOURCE 200809L

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>
#include <stdarg.h>
#include <unistd.h>
#include <sys/select.h>

#define INTBUFFERSIZE 1024
#define BUFFERSIZE 60
#define GET_LINE_DEBUG true

/*  Prints a debug message if debugging is on  */

void get_line_debug_msg(const char * msg, ...)
{
    va_list ap;
    va_start(ap, msg);
    if ( GET_LINE_DEBUG ) {
        vfprintf(stderr, msg, ap);
    }
    va_end(ap);
}

/*
 *  Gets a line from a file if one is available.
 *
 *  Returns:
 *    1 if a line was successfully gotten
 *    0 if a line is not yet available
 *    -1 on end-of-file (no more input available)
 *
 *  NOTE: This function can be used only with one file, and will
 *  be unusable once that file has reached the end.
 */

int get_line_if_ready(int fd, char * out_buffer, const size_t size)
{
    static char int_buffer[INTBUFFERSIZE + 1] = {0};  /*  Internal buffer  */
    static char * back = int_buffer;    /*  Next available space in buffer */
    static bool end_of_file = false;

    if ( !end_of_file ) {

        /*  Check if input is available  */

        fd_set fds;
        FD_ZERO(&fds);
        FD_SET(fd, &fds);
        struct timeval tv = {0, 0};

        int status;
        if ( (status = select(fd + 1, &fds, NULL, NULL, &tv)) == -1 ) {
            perror("error calling select()");
            exit(EXIT_FAILURE);
        }
        else if ( status == 0 && !strchr(int_buffer, '\n') ) {

            /*  Return zero if no new input is available and no
             *  complete line is already waiting in the buffer   */

            return 0;
        }

        /*  Get as much available input as will fit in buffer. Skip the
         *  read() when select() reported nothing, so it cannot block.  */

        if ( status > 0 ) {
            const size_t bufferspace = INTBUFFERSIZE - (back - int_buffer) - 1;
            const ssize_t numread = read(fd, back, bufferspace);
            if ( numread == -1 ) {
                perror("error calling read()");
                exit(EXIT_FAILURE);
            }
            else if ( numread == 0 ) {
                end_of_file = true;
            }
            else {
                const char * oldback = back;
                back += numread;
                *back = 0;

                get_line_debug_msg("(In function, just read [%s])\n"
                                   "(Internal buffer is [%s])\n",
                                   oldback, int_buffer);
            }
        }
    }

    /*  Write line to output buffer if a full line is available,
     *  or if we have reached the end of the file.                */

    char * endptr;
    const size_t bufferspace = INTBUFFERSIZE - (back - int_buffer) - 1;
    if ( (endptr = strchr(int_buffer, '\n')) ||
         bufferspace == 0 ||
         end_of_file ) {
        const size_t buf_len = back - int_buffer;
        if ( end_of_file && buf_len == 0 ) {

            /*  Buffer empty, we're done  */

            return -1;
        }

        /*  Prefer a complete line if there is one; otherwise (buffer
         *  full or end of file) hand back everything we have.        */

        endptr = endptr ? endptr + 1 : back;
        const size_t line_len = endptr - int_buffer;
        const size_t numcopy = line_len > (size - 1) ? (size - 1) : line_len;

        strncpy(out_buffer, int_buffer, numcopy);
        out_buffer[numcopy] = 0;
        memmove(int_buffer, int_buffer + numcopy, INTBUFFERSIZE - numcopy);
        back -= numcopy;

        return 1;
    }

    /*  No full line available, and not
     *  at end of file, so return 0.     */

    return 0;
}

int main(void)
{
    char buffer[BUFFERSIZE];

    FILE * fp = popen("./delayed_output", "r");
    if ( !fp ) {
        perror("error calling popen()");
        return EXIT_FAILURE;
    }

    sleep(1);       /*  Give child process some time to write output  */

    int n = 0;
    while ( n != -1 ) {

        /*  Loop until we get a line  */

        while ( !(n = get_line_if_ready(fileno(fp), buffer, BUFFERSIZE)) ) {

            /*  Here's where you could do other stuff if no line
             *  is available. Here, we'll just sleep for a while.  */

            printf("Line is not ready. Sleeping for five seconds.\n");
            sleep(5);
        }

        /*  Output it if we're not at end of file  */

        if ( n != -1 ) {
            const size_t len = strlen(buffer);
            if ( buffer[len - 1] == '\n' ) {
                buffer[len - 1] = 0;
            }

            printf("Got line: %s\n", buffer);
        }
    }

    if ( pclose(fp) == -1 ) {
        perror("error calling pclose()");
        return EXIT_FAILURE;
    }

    return 0;
}

And the output:

paul@thoth:~/src/sandbox/buffer$ ./buffer
(In function, just read [Here is some input...])
(Internal buffer is [Here is some input...])
Line is not ready. Sleeping for five seconds.
(In function, just read [and here is some more.
Yet more output is here...])
(Internal buffer is [Here is some input...and here is some more.
Yet more output is here...])
Got line: Here is some input...and here is some more.
Line is not ready. Sleeping for five seconds.
(In function, just read [and here's the end of it.
Here's some more, not ending with a newline. There are some more words here, to exceed our buffer.])
(Internal buffer is [Yet more output is here...and here's the end of it.
Here's some more, not ending with a newline. There are some more words here, to exceed our buffer.])
Got line: Yet more output is here...and here's the end of it.
Got line: Here's some more, not ending with a newline. There are some
Got line:  more words here, to exceed our buffer.
paul@thoth:~/src/sandbox/buffer$ 
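
For the first caveat above (static state), a per-file struct could look roughly like the sketch below. This is only one possible shape, not a drop-in replacement for the program above: the names line_buffer, lb_init, lb_get_line and LB_CAPACITY are invented for illustration, and errors are reported with a return value of -2 rather than by calling exit(). It also only touches the descriptor when no complete line is already buffered.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <unistd.h>
#include <sys/select.h>

#define LB_CAPACITY 1024        /*  Hypothetical internal buffer size  */

/*  All state for one descriptor lives here instead of in statics  */
struct line_buffer {
    int    fd;
    size_t len;                 /*  Bytes currently held in data[]  */
    bool   eof;
    char   data[LB_CAPACITY];
};

static void lb_init(struct line_buffer * lb, int fd)
{
    lb->fd = fd;
    lb->len = 0;
    lb->eof = false;
}

/*  Returns 1 if a line (or a full-buffer/EOF remainder) was copied to
 *  out, 0 if no line is ready yet, -1 at end of file, -2 on error.    */
static int lb_get_line(struct line_buffer * lb, char * out, size_t size)
{
    assert(lb && out && size > 1);

    char * nl = memchr(lb->data, '\n', lb->len);

    /*  Only poll the descriptor if we don't already hold a full line  */
    if ( !nl && !lb->eof && lb->len < LB_CAPACITY ) {
        fd_set fds;
        FD_ZERO(&fds);
        FD_SET(lb->fd, &fds);
        struct timeval tv = {0, 0};

        const int status = select(lb->fd + 1, &fds, NULL, NULL, &tv);
        if ( status == -1 ) {
            return -2;
        }
        if ( status > 0 ) {
            const ssize_t n = read(lb->fd, lb->data + lb->len,
                                   LB_CAPACITY - lb->len);
            if ( n == -1 ) {
                return -2;
            }
            if ( n == 0 ) {
                lb->eof = true;
            }
            else {
                lb->len += (size_t) n;
            }
        }
        nl = memchr(lb->data, '\n', lb->len);
    }

    if ( !nl && !lb->eof && lb->len < LB_CAPACITY ) {
        return 0;               /*  Partial line, keep waiting  */
    }
    if ( lb->len == 0 ) {
        return -1;              /*  End of file and nothing left  */
    }

    const size_t line_len = nl ? (size_t) (nl - lb->data) + 1 : lb->len;
    const size_t numcopy = line_len < size - 1 ? line_len : size - 1;

    memcpy(out, lb->data, numcopy);
    out[numcopy] = '\0';
    memmove(lb->data, lb->data + numcopy, lb->len - numcopy);
    lb->len -= numcopy;

    return 1;
}
```

Each descriptor gets its own struct line_buffer, initialised once with lb_init(), so several pipes can be polled in the same loop without their partial lines getting mixed up.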


【Solution 2】:

"Is there a way to tell how many bytes are ready to be read?"

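Not that I know of in C99/POSIX. I'd guess the functionality wasn't considered useful because files have a fixed size (most of the time, anyway). Unfortunately, as you've seen, select() is very rudimentary.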
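
Outside of those standards, many Unix-like systems (Linux and the BSDs among them) offer the FIONREAD ioctl, which reports how many bytes can currently be read from a descriptor. The sketch below assumes such a system; bytes_ready is a made-up helper name, and the request is not portable to every platform.

```c
#include <assert.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*  Returns the number of bytes that can be read from fd without
 *  blocking, or -1 on error. FIONREAD is a common extension and
 *  is NOT part of C99 or POSIX, so treat this as system-specific. */
static int bytes_ready(int fd)
{
    assert(fd >= 0);

    int n = 0;
    if ( ioctl(fd, FIONREAD, &n) == -1 ) {
        return -1;
    }
    return n;
}
```

Note that this only gives a byte count. It says nothing about whether those bytes contain a '\n', so it does not by itself make fgets() safe to call.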

"I guess what I really want is to know, before calling fgets, whether a complete line is ready to be read."

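fgets() buffers in a loop until it reaches a '\n'. Doing so consumes input from the underlying file descriptor, so you'll need to implement a non-blocking version yourself.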

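【Comments】:

  • When you say "you'll need to implement a non-blocking version yourself", are you suggesting that I use read() between select() calls to buffer the data, and then check for a '\n' each time I complete a read()?
  • @RPGillespie Yes, that's exactly what I mean :)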