【问题标题】:Justify text in R在 R 中对齐文本
【发布时间】:2016-04-15 02:47:33
【问题描述】:

如何在 R 中证明文本的合理性?对齐我的意思是段落中的每一行的长度完全相同(就像您在开放式办公室或 excel 中对齐时一样)。我试图找到strwrapcat 的选项,但没有成功。

## Get some sample text example from wikipedia api
library(httr)
library(xml2)
name <- "Invictus"
url <- URLencode(sprintf("https://en.wikisource.org/w/api.php?action=parse&prop=text&page=%s&format=json", name))
res <- read_html(content(GET(url))$parse$text[[1]])
string <- iconv(xml_text(xml_find_all(res, "//p"), trim=TRUE), "latin1", "ASCII", sub=" ")[1:2]
(string <- trimws(gsub('\\n|\\s{3,}', ' ', paste(string, collapse=" "))))
# [1] "Out of the night that covers me, Black as the pit from pole to pole, I thank whatever gods may be For my unconquerable soul.  In the fell clutch of circumstance I have not winced nor cried aloud. Under the bludgeonings of chance My head is bloody, but unbow'd.  Beyond this place of wrath and tears Looms but the Horror of the shade, And yet the menace of the years Finds and shall find me unafraid.  It matters not how strait the gate, How charged with punishments the scroll, I am the master of my fate: I am the captain of my soul."

使用上述功能的一些尝试

## Using these I can get left/right/center justified text but not
## justified like in other text editing programs or newspapers.
width <- 30
cat(paste(strwrap(string, width=width), collapse='\n'))

## Or with cat
tokens <- strsplit(string, '\\s+')[[1]]               # tokenise to pass to cat
out <- capture.output(cat(tokens, fill=width, sep=" "))  # strings <= width chars
cat(paste(out, collapse='\n'))

【问题讨论】:

  • 使用 TeX 渲染文本,例如通过 pandoc 或 tikzDevice 可能是你最好的选择
  • 如果渲染成 HTML,你可以使用 &lt;p style='text-align:justify;'&gt;your text here&lt;/p&gt; 标签集。见w3schools.com/cssref/pr_text_text-align.asp。如果您尝试在控制台中执行此操作,我祝您好运,但怀疑有什么可以帮助您的。
  • 这里有一些有用的信息:shiny.rstudio.com/articles/css.html。正如本杰明所说,你只需要在你的css中指定text-align: justify

标签: r text justify


【解决方案1】:

好吧,如果没有内置方法,这对我的目的来说已经足够好了。感谢上面关于如何使用 html 样式的 cmets。

justify <- function(string, width=getOption('width'), 
                    fill=c('random', 'right', 'left')) {
    strs <- strwrap(string, width=width)
    paste(fill_spaces(strs, width, match.arg(fill)), collapse="\n")
}

fill_spaces <- function(lines, width, fill) {
    tokens <- strsplit(lines, '\\s+')
    res <- lapply(head(tokens, -1L), function(x) {
        nspace <- length(x)-1L
        extra <- width - sum(nchar(x)) - nspace
        reps <- extra %/% nspace
        extra <- extra %% nspace
        times <- rep.int(if (reps>0) reps+1L else 1L, nspace)
        if (extra > 0) {
            if (fill=='right') times[1:extra] <- times[1:extra]+1L
            else if (fill=='left') 
                times[(nspace-extra+1L):nspace] <- times[(nspace-extra+1L):nspace]+1L
            else times[inds] <- times[(inds <- sample(nspace, extra))]+1L
        }
        spaces <- c('', unlist(lapply(times, formatC, x=' ', digits=NULL)))
        paste(c(rbind(spaces, x)), collapse='')
    })
    c(res, paste(tail(tokens, 1L)[[1]], collapse = ' '))
}

cat(justify(string, width=40))
# Out  of the night  that covers me, Black
# as  the pit from  pole to pole, I  thank
# whatever   gods    may    be   For    my
# unconquerable soul. In  the fell  clutch
# of  circumstance I have  not  winced nor
# cried  aloud. Under the  bludgeonings of
# chance My  head  is bloody, but unbow'd.
# Beyond this  place  of  wrath and  tears
# Looms but  the Horror of the  shade, And
# yet  the menace of the years  Finds  and
# shall  find me unafraid. It  matters not
# how strait  the  gate,  How charged with
# punishments the scroll,  I am the master
# of  my fate:  I  am  the  captain  of my
# soul.

【讨论】:

    【解决方案2】:

    @jenesaisquoi - 很好的解决方案!但是我发现如果有一个带有空格的段落分隔符或者strs &lt;- strwrap(string, width=width) 返回一个或更少的元素,它就不起作用。

    所以,我找到了一个改进的版本,它首先按段落/换行符拆分字符串,然后应用相同的逻辑:

            justify = function(string, width = getOption('width'), 
                               fill = c('random', 'right', 'left')) {
    
                            # Split text into paragraphs and remove trailing and leading white space. 
                            paragraphs = gsub("^\\s+|\\s+$", "", 
                            unlist(strsplit(x = string, split = "\n", fixed = TRUE)))
    
                            # NOTE: Empty elements are paragraphs break. 
                            paragraphs = paragraphs[nchar(paragraphs) > 0] 
    
                            formatted_text = lapply(paragraphs, function(paragraph){
                                                    strs = strwrap(paragraph, width = width)
                                                    paste(fill_spaces(strs, width, fill), collapse = "\n")
                                             })
    
                            paste0(unlist(formatted_text, recursive = FALSE), collapse = "\n")
             }
    
            fill_spaces = function(lines, width, fill) {
    
              tokens = strsplit(lines, '\\s+')
    
              res = lapply(head(tokens, -1L), function(x) {
                nspace = length(x) - 1L
                extra = width - sum(nchar(x)) - nspace
                reps = extra %/% nspace
                extra = extra %% nspace
                times = rep.int(if (reps > 0) reps + 1L else 1L, nspace)
                if (extra > 0) {
                  if (fill == 'right') times[1:extra] = times[1:extra] + 1L
                  else if (fill == 'left') 
                    times[(nspace - extra + 1L):nspace] = times[(nspace - extra + 1L):nspace] + 1L
                  else times[inds] = times[(inds <- sample(nspace, extra))] + 1L
                }
                spaces <- c('', unlist(lapply(times, formatC, x = ' ', digits = NULL)))
                paste(c(rbind(spaces, x)), collapse = '')
              })
              c(res, paste(tail(tokens, 1L)[[1]], collapse = ' '))
            }
    
            nchar_per_line = 50
            string = "Colin\'s practice outfits have reached a new level recently. It's difficult to determine the effect they are having on his teammates - whether they serve more as a distraction or a nice bit of comice releif, but it is clear they fuel Colin's fire and act as a motivator to him. \n\n                                  On another note, Colin's high fives per 36 have been through the roof recently. It really seems like something he's been focusing on in practice lately, as well as putting extra reps in at the gym. \n\n                                  Keep an eye out for Colin, could be an interesting 10 day pickup down the line."
            cat(justify(string, width = nchar_per_line))
    
    
    
        # Colin's practice outfits have reached  a new level
        # recently. It's difficult  to determine the  effect
        # they are  having on his teammates  -  whether they
        # serve  more  as a  distraction  or  a nice bit  of
        # comice releif, but  it  is clear they fuel Colin's
        # fire and act as a motivator to him.
        # On another note, Colin's  high  fives per 36  have
        # been  through  the roof recently. It  really seems
        # like  something he's been focusing on in  practice
        # lately,  as well  as putting extra reps in at  the
        # gym.
        # Keep   an   eye   out   for  Colin,  could  be  an
        # interesting 10 day pickup down the line.
    

    【讨论】:

      猜你喜欢
      • 1970-01-01
      • 1970-01-01
      • 2011-12-26
      • 1970-01-01
      • 1970-01-01
      • 2020-07-12
      • 1970-01-01
      • 2014-02-06
      • 2015-12-31
      相关资源
      最近更新 更多