【Posted】: 2016-10-03 05:51:57
【Question】:
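I am currently trying to write a C program that uses multithreading with pthreads to count the occurrences of each character in a text file. The file name is passed as a command-line argument, and the file is read into a 64 KB buffer. I split the buffer into 8 partitions, one for each of 8 threads. I am new to C and to multithreading, so this is over my head.
The program counts characters, but incorrectly, and I get different results every time I run it. Here is my code (updated):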
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>

#define BUFFER_SIZE 65536
#define NUMBER_OF_THREADS 8
#define NUM_CHARS 127

int charCount[NUM_CHARS + 1][8];

void* countChar(void *arg);

struct bufferPartition {
    unsigned char* start;
    int size;
    int index;
};

int main(int argc, char *argv[]){
    pthread_t tid[NUMBER_OF_THREADS];
    pthread_attr_t attr[NUMBER_OF_THREADS];
    size_t fileSize;
    unsigned char* buffer = (unsigned char *) malloc(BUFFER_SIZE);
    unsigned int bufferPartitionSize;

    printf("%i", argc);
    if(argc != 2){
        fprintf(stderr,"usage: a.out <integer value>\n");
        return -1;
    }

    FILE* fp = fopen(argv[1], "r+");
    if(fp == NULL){
        printf("Error! Could not open the file.");
        return -1;
    }

    fileSize = fread(buffer, 1, BUFFER_SIZE,fp);
    fclose(fp);

    if(fileSize % 8 != 0){
        bufferPartitionSize = ((8 - (fileSize % 8)) + fileSize) / 8;
    }else{
        bufferPartitionSize = fileSize / 8;
    }

    for(int index = 0; index < NUMBER_OF_THREADS; index++){
        struct bufferPartition* bufferPartition = (struct bufferPartition*)malloc(sizeof(struct bufferPartition));
        bufferPartition -> size = bufferPartitionSize;
        bufferPartition -> start = buffer + (index * (bufferPartition -> size));
        bufferPartition -> index = index + 1;
        pthread_attr_init(&attr[index]);
        pthread_create(&tid[index], &attr[index], countChar, bufferPartition);
    }

    for(int index = 0; index <= NUMBER_OF_THREADS; index++){
        pthread_join(tid[index], NULL);
    }

    for(unsigned int i = 0; i <= 128; i++){
        for(unsigned int k = 1; k <= NUMBER_OF_THREADS; k++){
            charCount[i][0] += charCount[i][k];
        }
        if(i < 32){
            printf("%i occurrences of 0x%x\n", charCount[i][0], i);
        }else{
            printf("%i occurrences of %c\n",charCount[i][0], i);
        }
    }
    return 0;
}

void* countChar(void *arg){
    struct bufferPartition* bufferPartition = (struct bufferPartition*) arg;
    unsigned int character;
    int threadNumber = bufferPartition->index;

    for(int index = 0; index < bufferPartition -> size; index++){
        character = bufferPartition -> start[index];
        (charCount[character][threadNumber])++;
    }
}
【Discussion】:
-
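Your code may exhibit undefined behavior. If the system's default char type is unsigned, a character can range from 0 to 255; if it is signed, the range can be -128 to 127. Both can fall outside the 0 to 127 range your code accepts. The simplest solution is to use unsigned char * for the buffer and UCHAR_MAX + 1 as the number of elements in the charCount array: int charCount[UCHAR_MAX + 1][NUMBER_OF_THREADS]; Indexing with a signed int can make things worse if the conversion from char sign-extends.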
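To illustrate the suggestion above, here is a minimal sketch of a tally loop that reads the data through an unsigned char pointer and sizes the table as UCHAR_MAX + 1, so every possible byte value has a slot. The names tally_bytes and TABLE_SIZE are hypothetical and do not appear in the question's code.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* One slot per possible byte value, so no index can fall outside the table. */
#define TABLE_SIZE (UCHAR_MAX + 1)

/* Hypothetical helper: tally each byte of data[0..len) into counts.
 * The caller is expected to zero counts beforehand. */
void tally_bytes(const void *data, size_t len, long counts[TABLE_SIZE])
{
    assert(data != NULL || len == 0);

    /* Reading through unsigned char keeps every index in 0..UCHAR_MAX,
     * whether plain char is signed or unsigned on this platform. */
    const unsigned char *bytes = data;

    for (size_t i = 0; i < len; i++) {
        counts[bytes[i]]++;
    }
}
```

With a plain char buffer on a platform where char is signed, a byte such as 0xFF would convert to a negative index; going through unsigned char avoids that case.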
-
Got it. I did that, and the code still counts characters that are not in the file.
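One likely contributor to the remaining phantom characters: the posted code rounds bufferPartitionSize up to a multiple of 8, so 8 partitions of that size can extend past fileSize into bytes of the malloc'd buffer that fread never wrote. The sketch below shows one way to clamp each partition to the data actually read. The helper name partition_bounds is hypothetical, not part of the question's code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: compute the half-open byte range
 * [*start, *start + *size) covered by partition `index` (0-based)
 * when a buffer of `len` bytes is split into `parts` partitions. */
void partition_bounds(size_t len, size_t parts, size_t index,
                      size_t *start, size_t *size)
{
    assert(parts > 0 && index < parts);

    size_t chunk = (len + parts - 1) / parts;  /* ceiling division */

    size_t begin = index * chunk;
    if (begin > len) {
        begin = len;                           /* partitions past the data are empty */
    }

    size_t end = begin + chunk;
    if (end > len) {
        end = len;                             /* clamp the final partition */
    }

    *start = begin;
    *size = end - begin;
}
```

For a 10-byte file and 8 partitions this yields five partitions of 2 bytes and three empty ones, so the sizes always sum to exactly the number of bytes read.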
-
What value does file_size get? Can you post your updated code?
-
Updated. file_size shows the correct number of characters in my file.
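Even with a correct file_size, the posted code still indexes past its arrays: the thread index runs from 1 to 8 while the second dimension of charCount has only 8 elements (valid indexes 0 to 7), the join loop reaches tid[8], and the print loop reaches charCount[128]. A minimal sketch of one way to avoid all three, assuming each thread gets its own private table and a clamped slice of the buffer, is shown below. The names slice, count_slice, count_parallel, THREADS and TABLE_SIZE are hypothetical, and this is an illustration rather than a confirmed fix from the thread.

```c
#include <assert.h>
#include <limits.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define THREADS 8
#define TABLE_SIZE (UCHAR_MAX + 1)

/* Work description plus a private result table for one thread. */
struct slice {
    const unsigned char *start;
    size_t size;
    long counts[TABLE_SIZE];
};

/* Thread entry point: tally the bytes of one slice into its own table. */
static void *count_slice(void *arg)
{
    struct slice *s = arg;
    for (size_t i = 0; i < s->size; i++) {
        s->counts[s->start[i]]++;
    }
    return NULL;
}

/* Count every byte of buf[0..len) into totals.
 * Returns 0 on success, -1 if a thread could not be created. */
int count_parallel(const unsigned char *buf, size_t len, long totals[TABLE_SIZE])
{
    assert(buf != NULL);

    struct slice slices[THREADS];
    pthread_t tid[THREADS];
    size_t chunk = (len + THREADS - 1) / THREADS;  /* ceiling division */
    int created = 0;
    int status = 0;

    memset(slices, 0, sizeof slices);

    for (int t = 0; t < THREADS; t++) {
        size_t begin = (size_t)t * chunk;
        if (begin > len) begin = len;
        size_t end = begin + chunk;
        if (end > len) end = len;                  /* never read past the data */

        slices[t].start = buf + begin;
        slices[t].size = end - begin;

        if (pthread_create(&tid[t], NULL, count_slice, &slices[t]) != 0) {
            status = -1;
            break;
        }
        created++;
    }

    /* Join only the threads that were actually created. */
    for (int t = 0; t < created; t++) {
        pthread_join(tid[t], NULL);
    }

    /* Merge the private tables after all threads have finished. */
    for (int c = 0; c < TABLE_SIZE; c++) {
        totals[c] = 0;
        for (int t = 0; t < created; t++) {
            totals[c] += slices[t].counts[c];
        }
    }
    return status;
}
```

Because no two threads write to the same table, no mutex is needed; the only synchronization is pthread_join before the merge. On most systems this needs to be compiled with the -pthread option.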
Tags: c multithreading