【Posted】: 2018-09-23 05:27:19
【Question】:
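I'm implementing a Twitter-like application in Erlang. I have both a distributed and a non-distributed implementation. I'm running a benchmark, but I can't find a way to send parallel requests to each user process in the distributed implementation. I'm using lists:foreach to send "get tweets" to a list of client processes. My understanding is that lists:foreach visits the elements of the list one at a time, so the requests are issued sequentially, and as a result my distributed implementation ends up with the same execution time as the non-distributed one. Is it possible to send the "get tweets" requests to the different client processes all at once? This seems like a fairly specific case to me, and it has been hard to find a solution on or off StackOverflow.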
test_get_tweets_Bench() ->
    {_ServerPid, UserInfos} = initializeForBench_server(),
    run_benchmark("timeline",
        fun () ->
            %% Issue 10000 get_tweets requests, one after another.
            lists:foreach(fun (_) ->
                    {UserId, ClientPid} = pick_random(UserInfos),
                    server:get_tweets(ClientPid, UserId, 1)
                end,
                lists:seq(1, 10000))
        end,
        30).

%% Pick a uniformly random element of a non-empty list.
pick_random(List) ->
    lists:nth(rand:uniform(length(List)), List).
UserInfos is a list of the form [{UserId, ClientPid}, ...].
After trying rpc:pmap instead of lists:foreach, my benchmark became about 3 times slower. The change is as follows:
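To make the question concrete, here is a minimal sketch of what "sending the requests all at once" could look like on a single node: each call runs in its own process and the caller waits until all of them have finished. The module name par_requests and the function par_foreach/2 are hypothetical and not part of the code in this post. Because server:get_tweets/3 waits for the reply in the calling process, each spawned worker receives its own reply.

```erlang
-module(par_requests).
-export([par_foreach/2]).

%% Hypothetical helper: run F(X) for every X in List, each in its own
%% linked process, and return ok once all of them have finished.
%% Results are discarded, like lists:foreach/2.
par_foreach(F, List) ->
    Parent = self(),
    Refs = [begin
                Ref = make_ref(),
                spawn_link(fun() ->
                                   F(X),
                                   Parent ! {done, Ref}
                           end),
                Ref
            end || X <- List],
    %% Selective receive: wait for the completion message of each worker.
    lists:foreach(fun(Ref) -> receive {done, Ref} -> ok end end, Refs).
```

In the benchmark above, the inner lists:foreach call could be swapped for par_requests:par_foreach with the same fun and the same lists:seq(1, 10000). Since the workers are linked, a crash in one request also takes down the benchmark process, and there is no timeout. Whether this is actually faster depends on where the time is spent, which this sketch does not address.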
test_get_tweets_Bench2() ->
    {_ServerPid, UserInfos} = initializeForBench_server(),
    run_benchmark("get_tweets 2",
        fun () ->
            %% rpc:pmap evaluates do_apply(Elem, Fun) for every Elem in the list.
            rpc:pmap({?MODULE, do_apply},
                [fun (_) ->
                    {UserId, ClientPid} = pick_random(UserInfos),
                    server:get_tweets(ClientPid, UserId, 1)
                end],
                lists:seq(1, 10000))
        end,
        30).

%% Same helper as in the first listing.
pick_random(List) ->
    lists:nth(rand:uniform(length(List)), List).

%% Must be exported, because rpc:pmap calls it via apply/3.
do_apply(X, F) ->
    F(X).
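I thought rpc:pmap would make my benchmark faster, since it would send the get_tweets requests in parallel.
Below is my server module, which is the API between my benchmark and my Twitter-like application. The API forwards requests from my benchmark to the Twitter-like application.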
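One variation that may be worth measuring is a fixed number of worker processes that each issue a share of the requests sequentially, instead of one process per request. This is only a sketch under the assumption that per-request process overhead matters here, which has not been measured. The module name chunked_bench and the function run_in_workers/3 are hypothetical.

```erlang
-module(chunked_bench).
-export([run_in_workers/3]).

%% Hypothetical helper: spawn NumWorkers linked processes; each one calls
%% F() PerWorker times in a row. Returns ok when every worker has finished.
run_in_workers(F, NumWorkers, PerWorker) ->
    Parent = self(),
    Refs = [begin
                Ref = make_ref(),
                spawn_link(fun() ->
                                   repeat(F, PerWorker),
                                   Parent ! {done, Ref}
                           end),
                Ref
            end || _ <- lists:seq(1, NumWorkers)],
    lists:foreach(fun(Ref) -> receive {done, Ref} -> ok end end, Refs).

%% Call F() N times sequentially.
repeat(_F, 0) -> ok;
repeat(F, N) when N > 0 ->
    F(),
    repeat(F, N - 1).
```

For example, 100 workers with 100 calls each would issue the same 10000 requests as the listings above, with F being a zero-argument fun that picks a random user and calls server:get_tweets/3. The worker count is a tuning knob to experiment with, not a recommendation.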
%% This module provides the protocol that is used to interact with an
%% implementation of a microblogging service.
%%
%% The interface is designed to be synchronous: it waits for the reply of the
%% system.
%%
%% This module defines the public API that is supposed to be used for
%% experiments. The semantics of the API here should remain unchanged.
-module(server).
-export([register_user/1,
         subscribe/3,
         get_timeline/3,
         get_tweets/3,
         tweet/3]).
%%
%% Server API
%%
% Register a new user. Returns its id and a pid that should be used for
% subsequent requests by this client.
-spec register_user(pid()) -> {integer(), pid()}.
register_user(ServerPid) ->
    ServerPid ! {self(), register_user},
    receive
        {ResponsePid, registered_user, UserId} -> {UserId, ResponsePid}
    end.
% Subscribe/follow another user.
-spec subscribe(pid(), integer(), integer()) -> ok.
subscribe(ServerPid, UserId, UserIdToSubscribeTo) ->
    ServerPid ! {self(), subscribe, UserId, UserIdToSubscribeTo},
    receive
        {_ResponsePid, subscribed, UserId, UserIdToSubscribeTo} -> ok
    end.
% Request a page of the timeline of a particular user.
% Request results can be 'paginated' to reduce the amount of data to be sent in
% a single response. This is up to the server.
-spec get_timeline(pid(), integer(), integer()) -> [{tweet, integer(), erlang:timestamp(), string()}].
get_timeline(ServerPid, UserId, Page) ->
    ServerPid ! {self(), get_timeline, UserId, Page},
    receive
        {_ResponsePid, timeline, UserId, Page, Timeline} ->
            Timeline
    end.
% Request a page of tweets of a particular user.
% Request results can be 'paginated' to reduce the amount of data to be sent in
% a single response. This is up to the server.
-spec get_tweets(pid(), integer(), integer()) -> [{tweet, integer(), erlang:timestamp(), string()}].
get_tweets(ServerPid, UserId, Page) ->
    ServerPid ! {self(), get_tweets, UserId, Page},
    receive
        {_ResponsePid, tweets, UserId, Page, Tweets} ->
            Tweets
    end.
% Submit a tweet for a user.
% (Authorization/security are not regarded in any way.)
-spec tweet(pid(), integer(), string()) -> erlang:timestamp().
tweet(ServerPid, UserId, Tweet) ->
    ServerPid ! {self(), tweet, UserId, Tweet},
    receive
        {_ResponsePid, tweet_accepted, UserId, Timestamp} ->
            Timestamp
    end.
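Since get_tweets/3 is just a send followed by a receive, the two steps can in principle be separated in benchmark code without touching this module: send every request first, then collect the replies. The sketch below assumes that each client process replies to the pid in the first element of the request with {_ResponsePid, tweets, UserId, Page, Tweets}, as the receive clause above suggests. The module name async_tweets and the function get_tweets_many/2 are hypothetical.

```erlang
-module(async_tweets).
-export([get_tweets_many/2]).

%% Hypothetical helper. Users is a list of {UserId, ClientPid} pairs.
%% Returns a list of {UserId, Tweets} in the same order as Users.
get_tweets_many(Users, Page) ->
    %% Phase 1: send every request without waiting for any reply.
    lists:foreach(fun({UserId, ClientPid}) ->
                          ClientPid ! {self(), get_tweets, UserId, Page}
                  end, Users),
    %% Phase 2: collect one reply per request, matched on UserId and Page
    %% (both are already bound, so the receive is selective).
    [receive
         {_ResponsePid, tweets, UserId, Page, Tweets} -> {UserId, Tweets}
     end || {UserId, _ClientPid} <- Users].
```

Like the synchronous API, this blocks forever if a reply never arrives; an `after` clause could be added if a timeout is wanted. All replies land in the mailbox of one process, so the client processes can work concurrently while the caller stays a single collector.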
【Discussion】:
-
I think for parallel requests in Erlang you should use a parallel map. You can look at pmap here: stackoverflow.com/questions/7595128/….
-
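After changing my lists:foreach to rpc:pmap, my benchmark now runs slower than before (about 2-3 times slower). I expected this change to speed things up, since it sends the requests to the distributed processes in parallel.
Tags: concurrency parallel-processing erlang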